1. Project Overview

This project uses PicoDet to detect objects in the drone-view VisDrone dataset and deploys the model for inference with ncnn.

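1.1 About PicoDet

PicoDet is a lightweight object detector recently released by Baidu. It explores anchor-free strategies in lightweight detection models and, through optimizations to the backbone, neck, label assignment strategy, and training recipe, achieves a better accuracy-efficiency trade-off.

PicoDet-S reaches 30.6% mAP with only 0.99M parameters, 4.8% higher than YOLOX-Nano with 55% lower inference latency, and 7.1% higher than NanoDet;

With an input size of 320 it runs at 123 FPS on a mobile ARM CPU, and at 150 FPS when Paddle Lite is the inference framework.

PicoDet-M reaches 34.3% mAP with only 2.15M parameters;

PicoDet-L reaches 40.9% mAP with only 3.3M parameters, 3.7% mAP higher than YOLOv5s and 44% faster at inference.

Comparison with other lightweight detectors: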

Paper: https://arxiv.org/abs/2111.00902

Code: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet

1.2 The VisDrone Dataset

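Drones equipped with cameras (general-purpose UAVs) have a wide range of applications, including agriculture, aerial photography, fast delivery, and surveillance.

VisDrone is a large drone-view dataset open-sourced by a team led by Tianjin University. The official release has 6471 training images and 548 validation images, labeled with the following 11 classes: 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', and 'others'. The 'others' class marks regions that are not valid targets and is ignored in this project.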
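The class list above can be written down as an ID table. The sketch below is illustrative: the dictionary names are hypothetical, and it also includes ID 0 ("ignored regions"), which appears in the annotation files and is dropped together with ID 11 by the conversion script in section 3.2.

```python
# VisDrone-DET category IDs as they appear in the annotation files.
VISDRONE_CATEGORIES = {
    0: "ignored regions",
    1: "pedestrian",
    2: "people",
    3: "bicycle",
    4: "car",
    5: "van",
    6: "truck",
    7: "tricycle",
    8: "awning-tricycle",
    9: "bus",
    10: "motor",
    11: "others",
}

# IDs dropped in this project: ignored regions and "others".
IGNORED_IDS = {0, 11}

# The 10 classes that are actually trained on.
TRAIN_CATEGORIES = {
    cid: name for cid, name in VISDRONE_CATEGORIES.items() if cid not in IGNORED_IDS
}

print(len(TRAIN_CATEGORIES), list(TRAIN_CATEGORIES.values()))
```

Keeping IDs 1 to 10 unchanged means the COCO category IDs can reuse the VisDrone IDs directly.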

Dataset home page: http://aiskyeye.com/challenge/object-detection/

Sample annotations from the dataset:

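2. The PicoDet Algorithm

The official walkthrough is very detailed; the slides are referenced below.

2.1 Key Features of PicoDet

2.2 The PicoDet Backbone

2.3 The PicoDet Architecture

2.4 Characteristics of SimOTA

2.5 Other Optimization Strategies

3. Data Preparation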

PaddleDetection uses the COCO format by default, while VisDrone has its own annotation format, so the annotations need to be converted.
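As a rough illustration of what the conversion involves: each VisDrone annotation line holds eight comma-separated integers (left, top, width, height, score, category, truncation, occlusion), and a COCO box is `[x, y, width, height]`, so the box itself carries over unchanged. The helper name and the sample line below are made up for this sketch.

```python
def visdrone_line_to_coco_bbox(line):
    """Parse one VisDrone annotation line into (category_id, [x, y, w, h])."""
    # Drop surrounding whitespace and a possible trailing comma, keep 8 fields.
    fields = [int(v) for v in line.strip().rstrip(",").split(",")[:8]]
    x, y, w, h, score, category, truncation, occlusion = fields
    # COCO boxes use the same top-left corner plus width/height layout.
    return category, [x, y, w, h]


# Hypothetical annotation line: a "car" (ID 4) at (100, 200), 30 x 60 pixels.
print(visdrone_line_to_coco_bbox("100,200,30,60,1,4,0,0"))
```

What the full script in section 3.2 adds on top of this is the surrounding COCO structure: image records, category records, and unique annotation IDs.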

3.1 Unzip the Dataset

!mkdir -p work/data
!unzip -oq data/data115729/VisDrone2019-DET-train.zip -d work/data
!unzip -oq data/data115729/VisDrone2019-DET-val.zip -d work/data

3.2 Convert the Data to COCO Format

import json
import os
import cv2

class Vis2COCO:
    def __init__(self, save_path, train_ratio, category_list, is_mode="train"):
        self.category_list = category_list
        self.images = []
        self.annotations = []
        self.categories = []
        self.img_id = 0
        self.ann_id = 0
        self.is_mode = is_mode
        self.train_ratio = train_ratio
        self.save_path = save_path  
        if not os.path.exists(self.save_path):
            os.makedirs(self.save_path)

    def to_coco(self, anno_dir, img_dir):
        self._init_categories()
        img_list = os.listdir(img_dir)
        for img_name in img_list:
            anno_path = os.path.join(anno_dir, os.path.splitext(img_name)[0] + '.txt')
            if not os.path.isfile(anno_path):
                print('Annotation file does not exist:', anno_path)
                continue

            img_path = os.path.join(img_dir, img_name)
            img = cv2.imread(img_path)
            if img is None:  # skip unreadable or non-image files
                print('Failed to read image:', img_path)
                continue
            h, w = img.shape[:2]
            self.images.append(self._image(img_path, h, w))
            if self.img_id % 500 == 0:
                print("Processed {} images".format(self.img_id))

            with open(anno_path, 'r') as f:
                for line in f:
                    # Fields: x, y, w, h, score, category, truncation, occlusion.
                    # Accept comma- or space-separated lines and tolerate a
                    # trailing separator by keeping only the first 8 fields.
                    fields = line.strip().replace(',', ' ').split()[:8]
                    if len(fields) < 8:
                        continue
                    try:
                        xmin, ymin, bw, bh, _score, category = [int(v) for v in fields[:6]]
                    except ValueError:
                        continue
                    # Skip ignored regions (0), "others" (11), and boxes under 4 px.
                    if category in (0, 11) or bw < 4 or bh < 4:
                        continue
                    label, bbox = category, [xmin, ymin, bw, bh]
                    annotation = self._annotation(label, bbox)
                    self.annotations.append(annotation)
                    self.ann_id += 1
            self.img_id += 1
        instance = {}
        instance['info'] = {'description': 'VisDrone'}
        instance['licenses'] = []
        instance['images'] = self.images
        instance['annotations'] = self.annotations
        instance['categories'] = self.categories
        return instance

    def _init_categories(self):
        cls_num = len(self.category_list)
        for v in range(1, cls_num + 1):
            #print(v)
            category = {}
            category['id'] = v
            category['name'] = self.category_list[v - 1]
            category['supercategory'] = self.category_list[v - 1]
            self.categories.append(category)

    def _image(self, path, h, w):
        image = {}
        image['height'] = h
        image['width'] = w
        image['id'] = self.img_id
        image['file_name'] = os.path.basename(path) 
        return image

    def _annotation(self, label, bbox):
        area = bbox[2] * bbox[3]
        annotation = {}
        annotation['id'] = self.ann_id
        annotation['image_id'] = self.img_id
        annotation['category_id'] = label
        annotation['segmentation'] = []  
        annotation['bbox'] = bbox
        annotation['iscrowd'] = 0
        annotation["ignore"] = 0
        annotation['area'] = area
        return annotation

    def save_coco_json(self, instance, save_path):
        with open(save_path, 'w') as fp:
            json.dump(instance, fp, indent=4, separators=(',', ': '))

def checkPath(path):
    if not os.path.exists(path):
        os.makedirs(path)

def cvt_vis2coco(img_path, anno_path, save_path, train_ratio=0.9, category_list=(), mode='train'):  # mode: train or val
    vis2coco = Vis2COCO(save_path, train_ratio, category_list, is_mode=mode)
    train_instance = vis2coco.to_coco(anno_path, img_path)
    if not os.path.exists(os.path.join(save_path, "Anno")):
        os.makedirs(os.path.join(save_path, "Anno"))
    vis2coco.save_coco_json(train_instance,
                               os.path.join(save_path, 'Anno', 'VisDrone2019-DET_{}_coco.json'.format(mode)))
    print('Process {} Done'.format(mode))    

if __name__=="__main__":
    # examples_write_json()
    root_path = '/home/aistudio/work/data/'
    category_list = ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor']
    for mode in ['train', 'val']:
        cvt_vis2coco(os.path.join(root_path, 'VisDrone2019-DET-{}/images'.format(mode)), 
                    os.path.join(root_path, 'VisDrone2019-DET-{}/annotations'.format(mode)),
                    root_path, category_list=category_list, mode=mode)      # mode: train or val
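After conversion it is worth checking the generated JSON before training. The following is a minimal sketch, not part of the original project: `summarize_coco` is a hypothetical helper, and the demo data is synthetic so the snippet runs without the dataset.

```python
import json
import os
import tempfile
from collections import Counter


def summarize_coco(json_path):
    """Return image/annotation counts and per-category box counts for a COCO file."""
    with open(json_path, "r") as f:
        coco = json.load(f)
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(a["category_id"] for a in coco["annotations"])
    return {
        "num_images": len(coco["images"]),
        "num_annotations": len(coco["annotations"]),
        # Unknown IDs are reported explicitly instead of raising KeyError.
        "per_category": {
            id_to_name.get(cid, "unknown id {}".format(cid)): n
            for cid, n in sorted(counts.items())
        },
    }


# Tiny synthetic file (made-up values) so the sketch runs on its own.
demo = {
    "images": [{"id": 0, "file_name": "a.jpg", "height": 10, "width": 10}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 4, "bbox": [1, 1, 5, 5]},
        {"id": 1, "image_id": 0, "category_id": 4, "bbox": [2, 2, 4, 4]},
        {"id": 2, "image_id": 0, "category_id": 1, "bbox": [0, 0, 4, 8]},
    ],
    "categories": [{"id": 1, "name": "pedestrian"}, {"id": 4, "name": "car"}],
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_coco.json")
    with open(demo_path, "w") as f:
        json.dump(demo, f)
    summary = summarize_coco(demo_path)
print(summary)
```

On the real data, the same function can be pointed at /home/aistudio/work/data/Anno/VisDrone2019-DET_train_coco.json; an "unknown id" entry or an unexpectedly small annotation count would suggest a problem in the conversion.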

4. Environment Setup

PicoDet requires PaddleDetection 2.3 or later, so using the latest version is recommended; this project uses 2.3.

# Already included in the project; no need to run
#!cd work && git clone https://gitee.com/paddlepaddle/PaddleDetection.git
# Switch to the working directory first
import os
os.chdir("/home/aistudio/work/PaddleDetection")
!pwd
/home/aistudio/work/PaddleDetection
# Install the dependencies
!pip install -r requirements.txt

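4.1 Modify the Model Configuration

In configs/picodet/picodet_l_640_coco.yml, set batch_size to 8 and base_lr to 0.03. The official defaults assume 8-GPU training, so single-GPU training should scale the learning rate down to 1/8; for convenience it was set to 1/10 here. The resulting file: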

_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/picodet_esnet.yml',
  '_base_/optimizer_300e.yml',
  '_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
weights: output/picodet_l_640_coco/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 40
snapshot_epoch: 10
epoch: 250

ESNet:
  scale: 1.25
  feature_maps: [4, 11, 14]
  act: hard_swish
  channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]

CSPPAN:
  out_channels: 160

PicoHead:
  conv_feat:
    name: PicoFeat
    feat_in: 160
    feat_out: 160
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
  feat_in_chan: 160

TrainReader:
  batch_size: 8

LearningRate:
  base_lr: 0.03
  schedulers:
  - !CosineDecay
    max_epochs: 300
  - !LinearWarmup
    start_factor: 0.1
    steps: 300
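The learning-rate adjustment described before the config follows the usual linear scaling heuristic: the learning rate is scaled in proportion to the total batch size. The sketch below is only an illustration. The reference rate of 0.3 is inferred from the statement that 0.03 is 1/10 of the default, and the per-GPU batch size of 8 on the reference setup is an assumption, not a value taken from the official config.

```python
def linear_scaled_lr(reference_lr, reference_total_batch, total_batch):
    """Scale the learning rate in proportion to the total batch size."""
    return reference_lr * total_batch / reference_total_batch


# Assumed reference setup: 8 GPUs x 8 images per GPU, base_lr 0.3 (inferred).
reference_lr = 0.3
reference_total_batch = 8 * 8

# Single GPU with batch_size 8, as in the modified config.
lr_one_gpu = linear_scaled_lr(reference_lr, reference_total_batch, 1 * 8)
print(round(lr_one_gpu, 4))  # 1/8 of the reference rate

# The value actually used in this project is slightly more conservative.
print(0.03 < lr_one_gpu)
```

Under these assumptions, strict scaling gives 0.0375, so rounding down to 0.03 stays close to the heuristic while erring on the side of a smaller step.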

In configs/datasets/coco_detection.yml, update the dataset paths and the number of classes. After the changes:

metric: COCO
num_classes: 10

TrainDataset:
  !COCODataSet
    image_dir: VisDrone2019-DET-train/images
    anno_path: Anno/VisDrone2019-DET_train_coco.json
    dataset_dir: ../data/
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: VisDrone2019-DET-val/images
    anno_path: Anno/VisDrone2019-DET_val_coco.json
    dataset_dir: ../data/
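A wrong path in this file is often only discovered once training starts, so a quick pre-check can save time. The sketch below assumes that images are resolved as dataset_dir/image_dir/file_name and the annotation file as dataset_dir/anno_path; `missing_images` is a hypothetical helper, and the demo uses a synthetic directory layout.

```python
import json
import os
import tempfile


def missing_images(dataset_dir, image_dir, anno_path):
    """List file_name entries in the annotation file that have no image on disk."""
    with open(os.path.join(dataset_dir, anno_path), "r") as f:
        coco = json.load(f)
    root = os.path.join(dataset_dir, image_dir)
    return [
        im["file_name"]
        for im in coco["images"]
        if not os.path.isfile(os.path.join(root, im["file_name"]))
    ]


# Synthetic layout: one image exists on disk, one is only listed in the JSON.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "images"))
    os.makedirs(os.path.join(tmp, "Anno"))
    open(os.path.join(tmp, "images", "a.jpg"), "w").close()
    with open(os.path.join(tmp, "Anno", "demo.json"), "w") as f:
        json.dump({"images": [{"file_name": "a.jpg"}, {"file_name": "b.jpg"}]}, f)
    missing = missing_images(tmp, "images", "Anno/demo.json")
print(missing)
```

With the config above and the working directory set to PaddleDetection, the corresponding call would be `missing_images("../data/", "VisDrone2019-DET-train/images", "Anno/VisDrone2019-DET_train_coco.json")`, which should return an empty list.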

4.2 Start Training

!python tools/train.py -c configs/picodet/picodet_l_640_coco.yml --eval

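The trained model is already saved in the project under output/picodet_l_640_coco/ and can be evaluated directly; mAP@0.5 is 34.2 and mAP@0.5:0.95 is 20.9.

To get metrics at another resolution, change target_size at line 21 of configs/picodet/_base_/picodet_640_reader.yml.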

!python tools/eval.py -c configs/picodet/picodet_l_640_coco.yml \
                    -o weights=output/picodet_l_640_coco/best_model.pdparams
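The two metrics quoted for the saved model differ only in the IoU threshold used to decide whether a detection matches a ground-truth box: mAP@0.5 uses a single threshold of 0.5, while mAP@0.5:0.95 averages over ten thresholds from 0.50 to 0.95. A minimal sketch of the IoU computation for COCO-style `[x, y, w, h]` boxes (the function name and the example boxes are illustrative):

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds averaged by mAP@0.5:0.95.
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Two 10 x 10 boxes shifted by half a width overlap with IoU = 1/3,
# which would not count as a match even at the loosest threshold of 0.5.
iou = iou_xywh([0, 0, 10, 10], [5, 0, 10, 10])
print(round(iou, 3), [t for t in thresholds if iou >= t])
```

Because small objects lose IoU quickly with a few pixels of offset, the gap between the two metrics tends to be large on datasets like VisDrone.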

5. Model Export

# Export the model
! python tools/export_model.py -c configs/picodet/picodet_l_640_coco.yml -o weights=output/picodet_l_640_coco/best_model.pdparams TestReader.inputs_def.image_shape=[1,3,640,640] --output_dir inference_model
# Run inference
!python deploy/python/infer.py --model_dir=./inference_model/picodet_l_640_coco \
            --image_file=../data/VisDrone2019-DET-val/images/0000242_02762_d_0000010.jpg \
            --device=GPU --threshold=0.2
# Visualize the predicted image

import cv2
import matplotlib.pyplot as plt
import numpy as np

image = cv2.imread('output/0000242_02762_d_0000010.jpg')
plt.figure(figsize=(15,10))
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()


6. Converting the Model to ncnn

PaddleDetection provides ncnn-based inference deployment.

Exporting the ONNX model requires the following environment:

paddle2onnx>=0.7
onnx>=1.10.1
onnx-simplifier>=0.3.6

# Install the dependencies
!pip install paddle2onnx==0.7 onnx==1.10.1 onnx-simplifier==0.3.6
# Convert to an ONNX model
!paddle2onnx --model_dir inference_model/picodet_l_640_coco/ \
            --model_filename model.pdmodel \
            --params_filename model.pdiparams \
            --opset_version 11 \
            --save_file picodet_l_640_coco.onnx
# Simplify the model
!python -m onnxsim picodet_l_640_coco.onnx picodet_l_640_coco_sim.onnx

The simplification step may report a failure like the one below; it does not affect the final use of the model:
Check failed, please be careful to use the simplified model, or try specifying "--skip-fuse-bn" or "--skip-optimization" (run "python3 -m onnxsim -h" for details)
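Given that warning, a basic sanity check of the simplified file may be reassuring before moving on to ncnn. The sketch below uses `onnx.load` and `onnx.checker.check_model`; it only validates the model structure and does not compare outputs against the original model, so it is a weaker check than the one onnxsim attempted. The helper name is hypothetical.

```python
import os


def check_onnx_model(path):
    """Return True if the file loads and passes the ONNX structural checker."""
    if not os.path.isfile(path):
        print("Model file not found:", path)
        return False
    # Imported lazily so the missing-file case works even without onnx installed.
    import onnx

    model = onnx.load(path)
    try:
        onnx.checker.check_model(model)
    except onnx.checker.ValidationError as err:
        print("Checker reported:", err)
        return False
    # Input/output names are useful later when wiring up the ncnn demo.
    print("inputs:", [i.name for i in model.graph.input])
    print("outputs:", [o.name for o in model.graph.output])
    return True


check_onnx_model("picodet_l_640_coco_sim.onnx")
```

If the check fails, re-running onnxsim with one of the flags suggested in the message is a reasonable next step.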

For ncnn post-processing and deployment, PaddleDetection provides the source files and detailed steps; see:
https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/deploy/third_engine/demo_ncnn

7. Summary

7.1 Metric Comparison

Model        Input size   Params   FLOPs     mAP@0.5   mAP@0.5:0.95
Picodet-l    640          3.3M     8.91G     34.2      20.9
Picodet-l    704          3.3M     10.78G    36.3      22.3
yolov5-s     640          7.2M     16.5G     31.3      16.9
yolov5-x     640          86.7M    205.7G    38.6      22.6
yolov3-spp   608          63.9M    151.72G   23.3      -
yolov3-spp   832          63.9M    284.10G   26.4      -

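Notes: 1) the yolov5 models were trained locally by the author and cannot conveniently be released; readers can train their own for comparison; 2) the yolov3-spp results are taken from SlimYOLOv3.

Comparison of results: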

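1) Picodet-l-640 beats yolov5-s by about 3-4 points with roughly half the parameters and FLOPs;

2) Picodet-l-704 comes close to yolov5-x on mAP@0.5:0.95 with far fewer parameters and FLOPs;

3) Picodet far exceeds yolov3-spp with a tiny fraction of the parameters and FLOPs;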
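The first two comparisons can be reproduced directly from the table. The sketch below just does the arithmetic on the reported numbers; the dictionary layout and the `compare` helper are illustrative.

```python
# Values copied from the table in section 7.1.
models = {
    "Picodet-l-640": {"params_m": 3.3, "flops_g": 8.91, "map50": 34.2, "map": 20.9},
    "Picodet-l-704": {"params_m": 3.3, "flops_g": 10.78, "map50": 36.3, "map": 22.3},
    "yolov5-s-640": {"params_m": 7.2, "flops_g": 16.5, "map50": 31.3, "map": 16.9},
    "yolov5-x-640": {"params_m": 86.7, "flops_g": 205.7, "map50": 38.6, "map": 22.6},
}


def compare(name_a, name_b):
    """Cost ratios and accuracy gaps of model A relative to model B."""
    a, b = models[name_a], models[name_b]
    return {
        "params_ratio": round(a["params_m"] / b["params_m"], 2),
        "flops_ratio": round(a["flops_g"] / b["flops_g"], 2),
        "map50_gain": round(a["map50"] - b["map50"], 1),
        "map_gain": round(a["map"] - b["map"], 1),
    }


vs_small = compare("Picodet-l-640", "yolov5-s-640")
vs_large = compare("Picodet-l-704", "yolov5-x-640")
print(vs_small)
print(vs_large)
```

Against yolov5-s, Picodet-l-640 uses about 46% of the parameters and 54% of the FLOPs while gaining 2.9 and 4.0 points; against yolov5-x, Picodet-l-704 uses roughly 4-5% of the cost and trails by 0.3 points on mAP@0.5:0.95.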

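With a model this good, there is every reason to put it to use!

7.2 Ways to Improve

The current setup is a first working detector on VisDrone; many approaches could improve the metrics further and remain to be tried: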

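1) Train at a higher resolution. VisDrone is dominated by small objects and Picodet is very lightweight, so when the performance budget allows, increasing the resolution should yield better metrics;

2) Load COCO-pretrained weights. Only the pretrained backbone is loaded at present; loading the official COCO-pretrained model is expected to add 1-2 points;

3) Try other hyperparameters, such as the learning rate and optimizer.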
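For the first item, the compute cost of a larger input can be estimated in advance: for a fully convolutional detector, FLOPs grow roughly with the number of input pixels, that is, with the square of the side length. The sketch below applies that rule of thumb to the 640 figure from section 7.1; the 800 input is a hypothetical example, not a measured result.

```python
def scaled_flops(base_flops_g, base_size, new_size):
    """Estimate FLOPs at a new square input size, assuming cost scales with pixel count."""
    return base_flops_g * (new_size / base_size) ** 2


# 8.91 GFLOPs at 640 x 640 is the reported value for Picodet-l.
print(round(scaled_flops(8.91, 640, 704), 2))
print(round(scaled_flops(8.91, 640, 800), 2))
```

The estimate for 704 reproduces the 10.78G listed in the table, which suggests the quadratic rule is a reasonable guide when budgeting for even larger inputs.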

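7.3 Next Steps

1) Add other modules or tricks to Picodet to try to improve the metrics further;

2) Deploy to a hardware platform.

Updates will follow as time permits.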
