PicoDet-based object detection on drone-view VisDrone data with ncnn deployment
This project uses PicoDet to detect objects in the VisDrone dataset, exploring small-object detection from a drone's viewpoint, and deploys the model with ncnn.
1. Project Overview
This project trains PicoDet on the drone-view VisDrone dataset and runs inference with ncnn.
1.1 About PicoDet
PicoDet is a lightweight object detection network recently released by Baidu. It explores anchor-free strategies in lightweight detectors and, through many optimizations to the backbone, neck, label-assignment strategy, and training recipe, achieves a better accuracy-efficiency trade-off.
PicoDet-S reaches 30.6% mAP with only 0.99M parameters: 4.8% higher than YOLOX-Nano with 55% lower inference latency, and 7.1% higher than NanoDet.
At an input size of 320 it runs at 123 FPS on a mobile ARM CPU, and up to 150 FPS with the Paddle Lite inference framework.
PicoDet-M reaches 34.3% mAP with only 2.15M parameters.
PicoDet-L reaches 40.9% mAP with only 3.3M parameters, 3.7% mAP higher than YOLOv5s with 44% faster inference.
Comparison with other lightweight detection networks:
Paper: https://arxiv.org/abs/2111.00902
Code: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet
1.2 The VisDrone Dataset
Camera-equipped drones have a wide range of applications, including agriculture, aerial photography, express delivery, and surveillance.
VisDrone is a large drone-view dataset open-sourced by a team at Tianjin University. The official release contains 6,471 training images and 548 validation images, annotated with the following 11 classes: 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others'. The 'others' class marks non-valid target regions and is ignored in this project.
Dataset homepage: http://aiskyeye.com/challenge/object-detection/
Annotated samples from the dataset:
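Each line of a VisDrone-DET annotation file holds eight comma-separated integer fields describing one box. A minimal parsing sketch (the sample line below is made up for illustration):

```python
# VisDrone-DET annotation line layout:
# bbox_left, bbox_top, bbox_width, bbox_height, score, category, truncation, occlusion
FIELDS = ["bbox_left", "bbox_top", "bbox_width", "bbox_height",
          "score", "category", "truncation", "occlusion"]

def parse_line(line):
    """Parse one annotation line into a field -> int dict."""
    return dict(zip(FIELDS, map(int, line.strip().split(","))))

record = parse_line("684,8,273,116,0,0,0,0")  # hypothetical sample line
```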
2. The PicoDet Algorithm
The official walkthrough is very detailed; the slides are referenced below.
2.1 Highlights of PicoDet
2.2 The PicoDet backbone
2.3 The PicoDet architecture
2.4 Characteristics of SimOTA
2.5 Other optimization strategies
3. Data Preparation
PaddleDetection expects COCO format by default, while VisDrone uses its own annotation format, so a conversion is needed.
3.1 Unpack the dataset
!mkdir work/data
!unzip -oq data/data115729/VisDrone2019-DET-train.zip -d work/data
!unzip -oq data/data115729/VisDrone2019-DET-val.zip -d work/data
3.2 Convert the annotations to COCO format
import json
import os
import cv2
import numpy as np
from PIL import Image
import shutil
class Vis2COCO:
    def __init__(self, save_path, train_ratio, category_list, is_mode="train"):
        self.category_list = category_list
        self.images = []
        self.annotations = []
        self.categories = []
        self.img_id = 0
        self.ann_id = 0
        self.is_mode = is_mode
        self.train_ratio = train_ratio
        self.save_path = save_path
        if not os.path.exists(self.save_path):
            os.makedirs(self.save_path)

    def to_coco(self, anno_dir, img_dir):
        self._init_categories()
        img_list = os.listdir(img_dir)
        for img_name in img_list:
            anno_path = os.path.join(anno_dir, img_name.replace(os.path.splitext(img_name)[-1], '.txt'))
            if not os.path.isfile(anno_path):
                print('File does not exist!', anno_path)
                continue
            img_path = os.path.join(img_dir, img_name)
            img = cv2.imread(img_path)
            h, w, c = img.shape
            self.images.append(self._image(img_path, h, w))
            if self.img_id % 500 == 0:
                print("Processing image {}".format(self.img_id))
            with open(anno_path, 'r') as f:
                for lineStr in f.readlines():
                    try:
                        if ',' in lineStr:
                            xmin, ymin, w, h, score, category, trunc, occlusion = lineStr.split(',')
                        else:
                            xmin, ymin, w, h, score, category, trunc, occlusion = lineStr.split()
                    except:
                        # print('error: ', anno_path, 'line: ', lineStr)
                        continue
                    # Skip ignored regions (0), the 'others' class (11) and boxes under 4 px
                    if int(category) in [0, 11] or int(w) < 4 or int(h) < 4:
                        continue
                    label, bbox = int(category), [int(xmin), int(ymin), int(w), int(h)]
                    annotation = self._annotation(label, bbox)
                    self.annotations.append(annotation)
                    self.ann_id += 1
            self.img_id += 1
        instance = {}
        instance['info'] = 'VisDrone'
        instance['license'] = ['none']
        instance['images'] = self.images
        instance['annotations'] = self.annotations
        instance['categories'] = self.categories
        return instance

    def _init_categories(self):
        cls_num = len(self.category_list)
        for v in range(1, cls_num + 1):
            category = {}
            category['id'] = v
            category['name'] = self.category_list[v - 1]
            category['supercategory'] = self.category_list[v - 1]
            self.categories.append(category)

    def _image(self, path, h, w):
        image = {}
        image['height'] = h
        image['width'] = w
        image['id'] = self.img_id
        image['file_name'] = os.path.basename(path)
        return image

    def _annotation(self, label, bbox):
        area = bbox[2] * bbox[3]
        annotation = {}
        annotation['id'] = self.ann_id
        annotation['image_id'] = self.img_id
        annotation['category_id'] = label
        annotation['segmentation'] = []
        annotation['bbox'] = bbox
        annotation['iscrowd'] = 0
        annotation["ignore"] = 0
        annotation['area'] = area
        return annotation

    def save_coco_json(self, instance, save_path):
        with open(save_path, 'w') as fp:
            json.dump(instance, fp, indent=4, separators=(',', ': '))


def checkPath(path):
    if not os.path.exists(path):
        os.makedirs(path)


def cvt_vis2coco(img_path, anno_path, save_path, train_ratio=0.9, category_list=[], mode='train'):  # mode: train or val
    vis2coco = Vis2COCO(save_path, train_ratio, category_list, is_mode=mode)
    train_instance = vis2coco.to_coco(anno_path, img_path)
    if not os.path.exists(os.path.join(save_path, "Anno")):
        os.makedirs(os.path.join(save_path, "Anno"))
    vis2coco.save_coco_json(train_instance,
                            os.path.join(save_path, 'Anno', 'VisDrone2019-DET_{}_coco.json'.format(mode)))
    print('Process {} Done'.format(mode))


if __name__ == "__main__":
    root_path = '/home/aistudio/work/data/'
    category_list = ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck',
                     'tricycle', 'awning-tricycle', 'bus', 'motor']
    for mode in ['train', 'val']:
        cvt_vis2coco(os.path.join(root_path, 'VisDrone2019-DET-{}/images'.format(mode)),
                     os.path.join(root_path, 'VisDrone2019-DET-{}/annotations'.format(mode)),
                     root_path, category_list=category_list, mode=mode)  # mode: train or val
4. Environment Setup
PicoDet requires PaddleDetection 2.3 or later, so use the latest version where possible; this project uses version 2.3.
# Already cloned in this project; no need to run again
#!cd work && git clone https://gitee.com/paddlepaddle/PaddleDetection.git
# First, switch to the working directory
import os
os.chdir("/home/aistudio/work/PaddleDetection")
!pwd
/home/aistudio/work/PaddleDetection
# Install the required packages
!pip install -r requirements.txt
4.1 Modify the model config
In configs/picodet/picodet_l_640_coco.yml, change batch_size to 8 and base_lr to 0.03. The official config assumes 8-GPU training, so on a single GPU the learning rate should be cut to 1/8; for convenience 1/10 is used here. The file then reads:
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250

ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]

CSPPAN:
  out_channels: 160

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160

TrainReader:
  batch_size: 8

LearningRate:
  base_lr: 0.03
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
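The learning-rate change above follows the usual linear scaling rule. A quick check of the arithmetic (the official 8-GPU base_lr of 0.3 is an assumption, implied by the 1/10 reduction down to 0.03):

```python
# Linear LR scaling sketch: the official config assumes 8-GPU training;
# on a single GPU the rate is scaled down accordingly.
official_base_lr = 0.3                            # assumed 8-GPU default
strict_single_gpu_lr = official_base_lr * 1 / 8   # exact linear scaling
rounded_lr = official_base_lr / 10                # 1/10 used here for convenience
```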
Modify the dataset paths and class count in configs/datasets/coco_detection.yml; after the changes it reads:
metric: COCO
num_classes: 10

TrainDataset:
  !COCODataSet
    image_dir: VisDrone2019-DET-train/images
    anno_path: Anno/VisDrone2019-DET_train_coco.json
    dataset_dir: ../data/
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: VisDrone2019-DET-val/images
    anno_path: Anno/VisDrone2019-DET_val_coco.json
    dataset_dir: ../data/
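A common pitfall with custom datasets is num_classes not matching the categories written into the converted json; the counts used in this project can be sanity-checked in a couple of lines (the list mirrors the one in the conversion script):

```python
# Class list used when converting VisDrone to COCO ('others' is dropped).
category_list = ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck',
                 'tricycle', 'awning-tricycle', 'bus', 'motor']
num_classes = 10  # must match configs/datasets/coco_detection.yml

assert len(category_list) == num_classes
```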
4.2 Start training
!python tools/train.py -c configs/picodet/picodet_l_640_coco.yml --eval
A trained model is already saved in the project at output/picodet_l_640_coco/ and can be evaluated directly; it reaches mAP@0.5 = 34.2 and mAP@0.5:0.95 = 20.9.
To evaluate at other resolutions, change target_size at line 21 of configs/picodet/_base_/picodet_640_reader.yml.
!python tools/eval.py -c configs/picodet/picodet_l_640_coco.yml \
-o weights=output/picodet_l_640_coco/best_model.pdparams
5. Model Export
# Export the model
! python tools/export_model.py -c configs/picodet/picodet_l_640_coco.yml -o weights=output/picodet_l_640_coco/best_model.pdparams TestReader.inputs_def.image_shape=[1,3,640,640] --output_dir inference_model
# Run inference
!python deploy/python/infer.py --model_dir=./inference_model/picodet_l_640_coco \
--image_file=../data/VisDrone2019-DET-val/images/0000242_02762_d_0000010.jpg \
--device=GPU --threshold=0.2
# Visualize the prediction
import cv2
import matplotlib.pyplot as plt
import numpy as np
image = cv2.imread('output/0000242_02762_d_0000010.jpg')
plt.figure(figsize=(15,10))
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()
![prediction visualization](output_21_0.png)
6. Converting the Model to ncnn
PaddleDetection provides ncnn-based inference deployment.
Exporting an ONNX model requires the following environment:
paddle2onnx>=0.7
onnx>=1.10.1
onnx-simplifier>=0.3.6
# Install dependencies
!pip install paddle2onnx==0.7 onnx==1.10.1 onnx-simplifier==0.3.6
# Convert to an ONNX model
!paddle2onnx --model_dir inference_model/picodet_l_640_coco/ \
--model_filename model.pdmodel \
--params_filename model.pdiparams \
--opset_version 11 \
--save_file picodet_l_640_coco.onnx
# Simplify the model
!python -m onnxsim picodet_l_640_coco.onnx picodet_l_640_coco_sim.onnx
Simplification may report a check failure; this does not affect the final model:
Check failed, please be careful to use the simplified model, or try specifying "--skip-fuse-bn" or "--skip-optimization" (run "python3 -m onnxsim -h" for details)
For ncnn post-processing and deployment, PaddleDetection provides the source files and detailed steps; see:
https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/deploy/third_engine/demo_ncnn
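PicoDet's head regresses each box side as a discrete distribution (GFL-style), so the ncnn demo's post-processing has to decode those distributions into distances before building boxes. Below is a rough numpy sketch of that decoding step, not the demo's actual code; the bin count (reg_max = 7) and function names are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decode_distances(reg_logits, reg_max=7):
    """Expected value of each side's bin distribution -> (l, t, r, b) in bin units."""
    prob = softmax(reg_logits.reshape(4, reg_max + 1), axis=-1)
    return prob @ np.arange(reg_max + 1, dtype=np.float32)

def distances_to_box(cx, cy, distances, stride):
    """Turn centre-point distances (scaled by the feature-map stride) into x1,y1,x2,y2."""
    l, t, r, b = distances * stride
    return [cx - l, cy - t, cx + r, cy + b]

# A distribution sharply peaked at bin 4 for all four sides decodes to ~4 bins each,
# so at stride 8 the box extends ~32 px from the centre point.
logits = np.zeros((4, 8), dtype=np.float32)
logits[:, 4] = 50.0
box = distances_to_box(100.0, 100.0, decode_distances(logits), stride=8)
```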
7. Summary
7.1 Metric comparison
| Model | Input size | Params | FLOPS | mAP@0.5 | mAP@0.5:0.95 |
|---|---|---|---|---|---|
| PicoDet-l | 640 | 3.3M | 8.91G | 34.2 | 20.9 |
| PicoDet-l | 704 | 3.3M | 10.78G | 36.3 | 22.3 |
| yolov5-s | 640 | 7.2M | 16.5G | 31.3 | 16.9 |
| yolov5-x | 640 | 86.7M | 205.7G | 38.6 | 22.6 |
| yolov3-spp | 608 | 63.9M | 151.72G | 23.3 | - |
| yolov3-spp | 832 | 63.9M | 284.10G | 26.4 | - |
Note: 1) the yolov5 models were trained locally and cannot be shared; feel free to train your own for comparison; 2) the yolov3-spp numbers are taken from SlimYolov3.
Observations:
1) PicoDet-l-640 beats yolov5-s by roughly 3-4 points with about half the parameters and FLOPS;
2) PicoDet-l-704 comes close to yolov5-x on mAP@0.5:0.95 with far less compute;
3) PicoDet far surpasses yolov3-spp with a tiny fraction of its parameters and FLOPS.
With a model this good, how could you not use it!
7.2 Ways to improve
This project is a first pass at detection on VisDrone; many approaches could push the metrics further and deserve more experiments:
1) Train at a higher resolution: VisDrone is dominated by small objects, and PicoDet is lightweight enough that, if compute allows, a larger input size should yield better metrics;
2) Load COCO-pretrained weights: currently only the backbone pretraining is loaded; starting from the official COCO-pretrained model would likely add 1-2 points;
3) Tune other hyperparameters, such as the learning rate and the optimizer.
7.3 Future plans
1) Add further modules or tricks to PicoDet to push the metrics higher;
2) Deploy to hardware platforms.
Updates to come as time permits.