Reposted from AI Studio. Original link: https://aistudio.baidu.com/aistudio/projectdetail/3742987

1. Introduction

PaddleDetection recently released PP-YOLOE, its latest in-house detection algorithm. It strikes a better balance between speed and accuracy, has delivered striking results, and has attracted a lot of attention along with many write-ups and applications.

github: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/ppyoloe

paper: https://arxiv.org/abs/2203.16250 

Besides the comparison on the COCO dataset, the authors also published a metric comparison against other strong algorithms on the VisDrone dataset, where PP-YOLOE leads by a large margin.

However, no config files or models have been released for this dataset yet, and since all three compared algorithms come in four scales (s/m/l/x), it is also unstated which scale the reported numbers were trained with.

This project applies PP-YOLOE to VisDrone. We walk through the full pipeline of training and testing, and validate the metrics with the s model; if your compute allows, you are welcome to reproduce the official numbers. Finally, we export an ONNX model for downstream deployment.

2. Dataset

VisDrone is a large-scale drone-view dataset open-sourced by a team from Tianjin University and collaborators. The official release contains 6,471 training images and 548 validation images, annotated with the following 11 categories: 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others'. The 'others' category marks non-target regions and is ignored in this project.
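
Each image has a matching .txt file in which every line describes one box with eight comma-separated fields: <bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>, where object_category 0 marks ignored regions and 1-10 map to the classes above. A minimal parsing sketch (the sample line is made up):

# Sketch: parse one VisDrone-DET annotation line (sample line is made up).
line = "684,8,273,116,0,1,0,0"
xmin, ymin, w, h, score, category, trunc, occlusion = map(int, line.split(','))
print(category, (xmin, ymin, w, h))   # 1 (684, 8, 273, 116)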

Annotation examples (figures in the original post).

For data-distribution statistics, refer to:

Dataset homepage: http://aiskyeye.com/challenge/object-detection/

AI Studio dataset: https://aistudio.baidu.com/aistudio/datasetdetail/115729

3. PP-YOLOE Overview

PP-YOLOE is a single-stage, anchor-free model built from a series of improvements and upgrades over PP-YOLO; it surpasses several popular YOLO variants and sets a new SOTA. It comes as a family of models, s/m/l/x, configured via a width multiplier and a depth multiplier. It also deliberately avoids special operators such as deformable convolution or Matrix NMS so that it can be deployed easily across a wide range of hardware, making it very deployment-friendly.
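
As a rough sketch of how a family member is instantiated (the multiplier values below are taken from the release/2.4 configs as best I recall; treat them as assumptions):

# Sketch: deriving the s/m/l/x variants from (depth_mult, width_mult).
# Multiplier values assumed from the release/2.4 configs.
VARIANTS = {'s': (0.33, 0.50), 'm': (0.67, 0.75), 'l': (1.00, 1.00), 'x': (1.33, 1.25)}

def scale(base_blocks, base_channels, variant):
    depth_mult, width_mult = VARIANTS[variant]
    blocks = max(round(base_blocks * depth_mult), 1)   # shallower for small models
    channels = int(base_channels * width_mult)         # narrower feature maps
    return blocks, channels

print(scale(3, 64, 's'))   # (1, 32): the s model is shallower and narrower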

PP-YOLOE consists of the following components:

1) A scalable backbone and neck

2) Task Alignment Learning (TAL)

3) An efficient task-aligned head with DFL and VFL

4) The SiLU activation function

The overall architecture of the s-scale model (figure in the original post).


3.1 Backbone

The backbone is the in-house CSPRepResNet, which improves on ResNet by borrowing ideas from CSPNet and RepVGG.

CSPNet performs cross-stage feature fusion through two branches; by integrating the gradient flow into the feature maps from end to end, it cuts computation substantially while preserving accuracy.

RepVGG improves on VGG with two main ideas:

During training, identity and residual branches are added to the VGG blocks, effectively bringing the essence of ResNet into a VGG-style network;

At inference time, an operator-fusion strategy converts every layer into a single 3x3 convolution, which makes the network easy to deploy and accelerate.
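
To make the fusion concrete, here is a minimal NumPy sketch that merges the three training-time branches into one 3x3 kernel (a sketch only: BatchNorm folding, which real implementations perform first, is omitted):

import numpy as np

def fuse_repvgg_branches(k3, k1):
    """Sketch: merge a RepVGG block's 3x3, 1x1 and identity branches
    into one 3x3 kernel. Kernel layout: (out_ch, in_ch, kh, kw).
    BatchNorm folding is omitted for brevity."""
    # Pad the 1x1 kernel to 3x3 so the branches can simply be summed.
    k1_padded = np.pad(k1, ((0, 0), (0, 0), (1, 1), (1, 1)))
    # The identity branch is equivalent to a 3x3 kernel with a 1 at the
    # center of the matching input channel.
    k_id = np.zeros_like(k3)
    for i in range(k3.shape[0]):
        k_id[i, i % k3.shape[1], 1, 1] = 1.0
    return k3 + k1_padded + k_id

k3 = np.random.rand(8, 8, 3, 3).astype('float32')
k1 = np.random.rand(8, 8, 1, 1).astype('float32')
print(fuse_repvgg_branches(k3, k1).shape)   # (8, 8, 3, 3)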

3.2 Head

The head adopts the TOOD design, i.e., T-Head, which consists of a Cls head and a Loc head. Concretely, T-Head first makes classification and localization predictions on top of the FPN features; TAL then computes task-alignment information based on the proposed task-alignment metric; finally, T-Head adjusts its classification probabilities and localization predictions according to the information fed back from TAL.

Both tasks predict from the same interactive features, yet the two tasks clearly need different feature representations, so the authors design a "layer attention" module that adapts the features for each task individually. Its structure is simple and can be understood as channel-wise attention: it yields task-specific features, from which the class and localization maps are then generated. The original post includes the schematic and network-structure diagrams of the detection head here.
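
A minimal PaddlePaddle sketch of this channel-wise attention (an illustration of the idea, not the exact T-Head code):

import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class TaskAttention(nn.Layer):
    """Sketch of the 'layer attention': pool global context, predict
    per-channel weights, and reweight the shared interactive features."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Conv2D(channels, channels, 1)

    def forward(self, feat):                    # feat: [N, C, H, W]
        ctx = F.adaptive_avg_pool2d(feat, 1)    # [N, C, 1, 1]
        weight = F.sigmoid(self.fc(ctx))        # per-channel weights in (0, 1)
        return feat * weight                    # task-specific features

feat = paddle.rand([1, 64, 20, 20])
cls_feat = TaskAttention(64)(feat)   # one instance per task (cls / loc)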

 

3.3 Label Assignment

Two assignment strategies are used: ATSS and TAL (in PP-YOLOE, ATSS serves as the static assigner in the early epochs before training switches to TAL).

The ATSS paper observes that the gap between one-stage anchor-based and center-based anchor-free detectors mainly comes from how positive and negative samples are selected. Based on this, it proposes ATSS (Adaptive Training Sample Selection), which automatically picks suitable positive anchor boxes from statistics of each GT, substantially improving model performance without extra computation or parameters.

TOOD proposes Task Alignment Learning (TAL) to explicitly pull the best anchors for the two tasks closer together. This is achieved through a sample-assignment scheme and a task-aligned loss: the assignment computes a task-alignment degree for every anchor, while the loss gradually unifies the best anchors for classification and localization.
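
The alignment degree is computed as t = s^α · u^β, where s is the predicted classification score and u is the IoU between the predicted box and the GT; the TOOD paper uses α=1 and β=6. A one-line sketch:

# Sketch: TAL's task-alignment metric t = s**alpha * u**beta
# (alpha=1.0, beta=6.0 follow the TOOD paper).
def alignment_metric(cls_score, iou, alpha=1.0, beta=6.0):
    return (cls_score ** alpha) * (iou ** beta)

print(alignment_metric(0.8, 0.9))   # anchors good at both tasks score highest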

4. Data Preparation

PaddleDetection defaults to the COCO format, while VisDrone ships with its own annotation format, so the data must be converted to meet PaddleDetection's requirements.

4.1 Extract the dataset

In [1]

!mkdir work/data
!unzip -oq data/data115729/VisDrone2019-DET-train.zip -d work/data
!unzip -oq data/data115729/VisDrone2019-DET-val.zip -d work/data

4.2 Convert the annotations to COCO format

In [2]

import json
import os
import cv2
import numpy as np
from PIL import Image
import shutil

class Vis2COCO:
    def __init__(self, category_list, is_mode="train"):
        self.category_list = category_list
        self.images = []
        self.annotations = []
        self.categories = []
        self.img_id = 0
        self.ann_id = 0
        self.is_mode = is_mode

    def to_coco(self, anno_dir, img_dir):
        self._init_categories()
        img_list = os.listdir(img_dir)
        for img_name in img_list:
            anno_path = os.path.join(anno_dir, img_name.replace(os.path.splitext(img_name)[-1], '.txt'))
            if not os.path.isfile(anno_path):
                print('File does not exist:', anno_path)
                continue

            img_path = os.path.join(img_dir, img_name)
            img = cv2.imread(img_path)
            h, w, c = img.shape
            self.images.append(self._image(img_path, h, w))
            if self.img_id % 500 == 0:
                print("处理到第{}张图片".format(self.img_id))

            with open(anno_path, 'r') as f:
                for lineStr in f.readlines():
                    try:
                        # VisDrone fields: bbox_left, bbox_top, width, height,
                        # score, category, truncation, occlusion
                        if ',' in lineStr:
                            xmin, ymin, bw, bh, score, category, trunc, occlusion = lineStr.split(',')
                        else:
                            xmin, ymin, bw, bh, score, category, trunc, occlusion = lineStr.split()
                    except:
                        # skip malformed lines
                        continue
                    # drop ignored regions (0) / others (11) and tiny boxes
                    if int(category) in [0, 11] or int(bw) < 4 or int(bh) < 4:
                        continue
                    label, bbox = int(category), [int(xmin), int(ymin), int(bw), int(bh)]
                    annotation = self._annotation(label, bbox)
                    self.annotations.append(annotation)
                    self.ann_id += 1
            self.img_id += 1
        instance = {}
        instance['info'] = 'VisDrone'
        instance['license'] = ['none']
        instance['images'] = self.images
        instance['annotations'] = self.annotations
        instance['categories'] = self.categories
        return instance

    def _init_categories(self):
        cls_num = len(self.category_list)
        for v in range(1, cls_num + 1):
            #print(v)
            category = {}
            category['id'] = v
            category['name'] = self.category_list[v - 1]
            category['supercategory'] = self.category_list[v - 1]
            self.categories.append(category)

    def _image(self, path, h, w):
        image = {}
        image['height'] = h
        image['width'] = w
        image['id'] = self.img_id
        image['file_name'] = os.path.basename(path) 
        return image

    def _annotation(self, label, bbox):
        area = bbox[2] * bbox[3]
        annotation = {}
        annotation['id'] = self.ann_id
        annotation['image_id'] = self.img_id
        annotation['category_id'] = label
        annotation['segmentation'] = []  
        annotation['bbox'] = bbox
        annotation['iscrowd'] = 0
        annotation["ignore"] = 0
        annotation['area'] = area
        return annotation

    def save_coco_json(self, instance, save_path):
        import json
        with open(save_path, 'w') as fp:
            json.dump(instance, fp, indent=4, separators=(',', ': ')) 

def checkPath(path):
    if not os.path.exists(path):
        os.makedirs(path)

def cvt_vis2coco(img_path, anno_path, save_path, category_list=[], mode='train'):  # mode: train or val
    vis2coco = Vis2COCO( category_list, is_mode=mode)
    instance = vis2coco.to_coco(anno_path, img_path)
    if not os.path.exists(os.path.join(save_path, "Anno")):
        os.makedirs(os.path.join(save_path, "Anno"))
    vis2coco.save_coco_json(instance,
                               os.path.join(save_path, 'Anno', 'instances_{}2017.json'.format(mode)))
    print('Process {} Done'.format(mode))    

if __name__=="__main__":
    # examples_write_json()
    root_path = '/home/aistudio/work/data/'
    category_list = ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor']
    for mode in ['train', 'val']:
        cvt_vis2coco(os.path.join(root_path, 'VisDrone2019-DET-{}/images'.format(mode)), 
                    os.path.join(root_path, 'VisDrone2019-DET-{}/annotations'.format(mode)),
                    root_path, category_list=category_list, mode=mode)      # mode: train or val
Processing image 0
Processing image 500
Processing image 1000
Processing image 1500
Processing image 2000
Processing image 2500
Processing image 3000
Processing image 3500
Processing image 4000
Processing image 4500
Processing image 5000
Processing image 5500
Processing image 6000
Process train Done
Processing image 0
Processing image 500
Process val Done

5. Model Training

5.1 Environment setup

PP-YOLOE requires PaddleDetection 2.4 or later, so use the most recent release where possible; this project uses 2.4.

In [ ]

#Already cloned in this project; no need to run again
#!cd work && git clone https://gitee.com/paddlepaddle/PaddleDetection.git

In [3]


# Switch to the working directory first
import os
os.chdir("/home/aistudio/work/PaddleDetection-release-2.4")
!pwd
/home/aistudio/work/PaddleDetection-release-2.4

In [ ]

# Install dependencies
!pip install -r requirements.txt

5.2 Modify the model config

In configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml, set batch_size to 16 and base_lr to 0.04. The official default assumes 8-GPU training, so on a single GPU the learning rate needs to be reduced to 1/8.

Also update the dataset paths and the number of classes.

Finally, adjust the multi-scale training range: VisDrone is dominated by small objects, so remove the smallest scales, and drop the Expand step from the data augmentation to avoid shrinking targets too much. An illustrative sketch of these edits follows.
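
The original post shows the edited files as screenshots. Below is a sketch of the kind of changes involved; the keys and values are assumptions based on the descriptions above and the release/2.4 config layout, not the verbatim files:

# ppyoloe_crn_s_300e_coco.yml (sketch, not verbatim)
TrainReader:
  batch_size: 16              # single-GPU setting used in this project
  batch_transforms:
    - BatchRandomResize:      # drop the smallest scales (small-object dataset)
        target_size: [480, 512, 544, 576, 608, 640, 672, 704, 736, 768]
  # sample_transforms: remove the Expand/RandomExpand entry here

LearningRate:
  base_lr: 0.04               # 1/8 of the 8-GPU default

# dataset config (e.g. configs/datasets/coco_detection.yml)
num_classes: 10
TrainDataset:
  dataset_dir: /home/aistudio/work/data
  image_dir: VisDrone2019-DET-train/images
  anno_path: Anno/instances_train2017.json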

5.3 Training and evaluation

In [ ]

!python tools/train.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml --eval --amp

In [6]

# Evaluate the model
!python tools/eval.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml \
                    -o weights=output/ppyoloe_crn_s_300e_coco/best_model.pdparams
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0421 14:24:09.454457 12973 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W0421 14:24:09.459218 12973 device_context.cc:465] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.20s)
creating index...
index created!
[04/21 14:24:13] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_s_300e_coco/best_model.pdparams
[04/21 14:24:13] ppdet.engine INFO: Eval iter: 0
[04/21 14:24:22] ppdet.engine INFO: Eval iter: 100
[04/21 14:24:31] ppdet.engine INFO: Eval iter: 200
[04/21 14:24:39] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.28s)
creating index...
index created!
[04/21 14:24:40] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.54s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=29.75s).
Accumulating evaluation results...
DONE (t=0.94s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.246
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.396
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.255
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.147
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.373
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.525
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.110
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.277
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.333
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.225
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.488
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.635
[04/21 14:25:11] ppdet.engine INFO: Total sample number: 548, averge FPS: 21.898353005576705

This yields an AP of 24.6 and an AP50 of 39.6.

6. Model Export and Inference

The trained model can be exported as a static-graph model for inference, or as an ONNX model for downstream deployment.

In [17]

# Export the model
! python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml -o weights=output/ppyoloe_crn_s_300e_coco/best_model.pdparams \
                                                   TestReader.inputs_def.image_shape=[1,3,640,640] --output_dir output_inference
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/tensor/creation.py:130: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  if data.dtype == np.object:
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[04/21 14:33:44] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_s_300e_coco/model_final.pdparams
[04/21 14:33:44] ppdet.engine INFO: Export inference config file to output_inference/ppyoloe_crn_s_300e_coco/infer_cfg.yml
W0421 14:33:47.490147 14246 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W0421 14:33:47.490203 14246 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[04/21 14:33:49] ppdet.engine INFO: Export model and saved in output_inference/ppyoloe_crn_s_300e_coco

In [18]

# Run inference with the deploy script
!python deploy/python/infer.py --model_dir=./output_inference/ppyoloe_crn_s_300e_coco \
            --image_file=../data/VisDrone2019-DET-val/images/0000242_02762_d_0000010.jpg \
            --device=GPU --threshold=0.2
-----------  Running Arguments -----------
action_file: None
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
enable_mkldnn: False
enable_mkldnn_bfloat16: False
image_dir: None
image_file: ../data/VisDrone2019-DET-val/images/0000242_02762_d_0000010.jpg
model_dir: ./output_inference/ppyoloe_crn_s_300e_coco
output_dir: output
random_pad: False
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: paddle
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
threshold: 0.2
tracker_config: None
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
window_size: 50
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order: 
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
class_id:0, confidence:0.7403, left_top:[682.40,369.72],right_bottom:[698.77,398.33]
class_id:0, confidence:0.6031, left_top:[255.11,32.74],right_bottom:[262.35,50.27]
class_id:0, confidence:0.4188, left_top:[216.22,23.99],right_bottom:[223.42,39.18]
class_id:0, confidence:0.3969, left_top:[564.11,186.43],right_bottom:[572.30,208.88]
class_id:0, confidence:0.3357, left_top:[590.26,228.64],right_bottom:[600.33,252.85]
class_id:0, confidence:0.3056, left_top:[70.37,48.15],right_bottom:[79.97,66.41]
class_id:0, confidence:0.2991, left_top:[74.34,57.03],right_bottom:[84.45,71.96]
class_id:0, confidence:0.2606, left_top:[576.65,198.28],right_bottom:[585.06,215.03]
class_id:0, confidence:0.2493, left_top:[208.18,24.50],right_bottom:[215.50,39.35]
class_id:0, confidence:0.2343, left_top:[111.87,24.94],right_bottom:[119.99,41.12]
class_id:0, confidence:0.2270, left_top:[254.29,24.59],right_bottom:[262.31,41.51]
class_id:0, confidence:0.2220, left_top:[274.13,47.58],right_bottom:[281.30,62.75]
class_id:0, confidence:0.2172, left_top:[264.27,9.80],right_bottom:[271.67,31.43]
class_id:0, confidence:0.2059, left_top:[687.93,372.49],right_bottom:[703.26,399.52]
class_id:0, confidence:0.2033, left_top:[298.50,41.20],right_bottom:[305.56,54.85]
class_id:1, confidence:0.5634, left_top:[585.51,417.50],right_bottom:[600.01,440.17]
class_id:1, confidence:0.4150, left_top:[407.36,272.13],right_bottom:[417.78,288.92]
class_id:1, confidence:0.3345, left_top:[406.67,274.06],right_bottom:[418.35,296.44]
class_id:1, confidence:0.2821, left_top:[216.53,23.41],right_bottom:[223.62,39.10]
class_id:1, confidence:0.2661, left_top:[208.18,24.50],right_bottom:[215.50,39.35]
class_id:1, confidence:0.2176, left_top:[273.83,44.93],right_bottom:[280.81,61.28]
class_id:2, confidence:0.8019, left_top:[584.73,418.97],right_bottom:[601.61,456.52]
class_id:2, confidence:0.2349, left_top:[576.65,198.28],right_bottom:[585.06,215.03]
class_id:2, confidence:0.2070, left_top:[585.50,416.89],right_bottom:[601.09,443.51]
class_id:3, confidence:0.9132, left_top:[335.51,138.43],right_bottom:[365.33,177.18]
class_id:3, confidence:0.9130, left_top:[273.80,196.21],right_bottom:[308.28,242.25]
class_id:3, confidence:0.9113, left_top:[383.26,153.87],right_bottom:[416.04,186.39]
class_id:3, confidence:0.9113, left_top:[434.67,147.58],right_bottom:[461.17,182.30]
class_id:3, confidence:0.9048, left_top:[83.74,105.93],right_bottom:[135.14,128.26]
class_id:3, confidence:0.9012, left_top:[152.19,85.96],right_bottom:[202.97,108.53]
class_id:3, confidence:0.8995, left_top:[259.91,303.45],right_bottom:[311.02,380.26]
class_id:3, confidence:0.8955, left_top:[328.47,61.97],right_bottom:[351.79,87.31]
class_id:3, confidence:0.8884, left_top:[466.01,251.52],right_bottom:[500.05,303.08]
class_id:3, confidence:0.8868, left_top:[463.36,120.60],right_bottom:[487.01,152.71]
class_id:3, confidence:0.8854, left_top:[359.39,64.99],right_bottom:[382.56,94.97]
class_id:3, confidence:0.8849, left_top:[728.18,60.82],right_bottom:[770.74,81.78]
class_id:3, confidence:0.8831, left_top:[42.40,182.54],right_bottom:[106.15,210.59]
class_id:3, confidence:0.8720, left_top:[0.69,160.30],right_bottom:[66.17,186.80]
class_id:3, confidence:0.8567, left_top:[457.52,76.23],right_bottom:[478.80,99.52]
class_id:3, confidence:0.8562, left_top:[453.73,40.99],right_bottom:[472.34,61.64]
class_id:3, confidence:0.8518, left_top:[460.27,489.59],right_bottom:[511.61,540.22]
class_id:3, confidence:0.8487, left_top:[802.61,53.41],right_bottom:[830.56,76.80]
class_id:3, confidence:0.8396, left_top:[771.94,89.00],right_bottom:[800.47,114.64]
class_id:3, confidence:0.7893, left_top:[659.04,66.23],right_bottom:[693.35,87.54]
class_id:3, confidence:0.6776, left_top:[720.33,89.13],right_bottom:[755.89,111.78]
class_id:3, confidence:0.3606, left_top:[291.12,516.98],right_bottom:[337.17,539.94]
class_id:3, confidence:0.2917, left_top:[666.79,53.74],right_bottom:[695.98,74.75]
class_id:3, confidence:0.2435, left_top:[2.73,116.84],right_bottom:[47.41,139.27]
class_id:3, confidence:0.2016, left_top:[-0.03,5.90],right_bottom:[33.60,26.54]
class_id:6, confidence:0.7651, left_top:[101.71,155.20],right_bottom:[146.12,176.29]
class_id:7, confidence:0.2794, left_top:[101.71,155.20],right_bottom:[146.12,176.29]
class_id:9, confidence:0.7107, left_top:[406.48,275.23],right_bottom:[418.47,299.63]
class_id:9, confidence:0.2176, left_top:[208.19,27.13],right_bottom:[215.17,40.10]
save result to: output/0000242_02762_d_0000010.jpg
Test iter 0
------------------ Inference Time Info ----------------------
total_time(ms): 1738.3, img_num: 1
average latency time(ms): 1738.30, QPS: 0.575275
preprocess_time(ms): 1714.70, inference_time(ms): 23.50, postprocess_time(ms): 0.10

In [19]

# Visualize the prediction
import cv2
import matplotlib.pyplot as plt
import numpy as np

image = cv2.imread('output/0000242_02762_d_0000010.jpg')
plt.figure(figsize=(15,10))
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()

<Figure size 1080x720 with 1 Axes>

Next we export the model to ONNX for deployment on other hardware platforms.

PaddleDetection provides ncnn-based inference deployment.

Exporting an ONNX model requires the following environment:

paddle2onnx>=0.7 onnx>=1.10.1 onnx-simplifier>=0.3.6

In [ ]

# Install the dependencies
!pip install paddle2onnx==0.7 onnx==1.10.1 onnx-simplifier==0.3.6

In [20]

# Re-export the model without NMS and other post-processing
! python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml -o weights=output/ppyoloe_crn_s_300e_coco/best_model.pdparams exclude_nms=True \
                                                   TestReader.inputs_def.image_shape=[1,3,640,640] --output_dir output_inference2
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/tensor/creation.py:130: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  if data.dtype == np.object:
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[04/21 14:35:39] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_s_300e_coco/model_final.pdparams
[04/21 14:35:40] ppdet.engine INFO: Export inference config file to output_inference2/ppyoloe_crn_s_300e_coco/infer_cfg.yml
W0421 14:35:43.050825 14466 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W0421 14:35:43.050880 14466 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[04/21 14:35:45] ppdet.engine INFO: Export model and saved in output_inference2/ppyoloe_crn_s_300e_coco

In [21]

# Convert to an ONNX model
!paddle2onnx --model_dir output_inference2/ppyoloe_crn_s_300e_coco/ \
            --model_filename model.pdmodel \
            --params_filename model.pdiparams \
            --opset_version 11 \
            --save_file ppyoloe_crn_s_300e_coco.onnx
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle2onnx/onnx_helper/mapping.py:42: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  int(TensorProto.STRING): np.dtype(np.object)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:43: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  np.bool: core.VarDesc.VarType.BOOL,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:44: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  core.VarDesc.VarType.FP32: np.float,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:49: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  core.VarDesc.VarType.BOOL: np.bool
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle2onnx/onnx_helper/helper.py:235: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  is_iterable = isinstance(value, collections.Iterable)
2022-04-21 14:36:00 [INFO]	ONNX model saved in ppyoloe_crn_s_300e_coco.onnx

In [22]

# Simplify the model
!python -m onnxsim ppyoloe_crn_s_300e_coco.onnx ppyoloe_crn_s_300e_coco_sim.onnx
Simplifying...
Checking 0/3...
Tensor sum_0.tmp_0 changes after simplifying. The max diff is 0.0205078125.
Note that the checking is not always correct.
After simplifying:
[[[-4626.6465   -253.13628  5128.2534    318.52835]
  [-3947.9695   -251.94598  5910.2114    320.61066]
  [-3481.4626   -259.7044   7072.6196    324.95926]
  ...
  [13024.912     719.8111  14136.699     733.54614]
  [13158.68      715.71814 14143.302     734.0128 ]
  [13105.94      709.787   14139.707     735.0507 ]]]
Before simplifying:
[[[-4626.646    -253.1363   5128.2544    318.5283 ]
  [-3947.9688   -251.946    5910.213     320.61063]
  [-3481.4617   -259.70438  7072.6196    324.95926]
  ...
  [13024.912     719.8111  14136.699     733.54614]
  [13158.68      715.71814 14143.302     734.0128 ]
  [13105.94      709.78705 14139.708     735.0507 ]]]
----------------
Check failed, please be careful to use the simplified model, or try specifying "--skip-fuse-bn" or "--skip-optimization" (run "python3 -m onnxsim -h" for details)

The simplification step may report a check failure; it does not affect the final usability of the model: "Check failed, please be careful to use the simplified model, or try specifying "--skip-fuse-bn" or "--skip-optimization" (run "python3 -m onnxsim -h" for details)".
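
As an optional sanity check, you can run the simplified model on random inputs and inspect the output shapes; a sketch, assuming onnxruntime is installed (it is not among this project's dependencies):

# Sketch: sanity-check the simplified ONNX model with onnxruntime.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('ppyoloe_crn_s_300e_coco_sim.onnx')
feed = {}
for inp in sess.get_inputs():
    # replace any dynamic dimension with 1; the exported inputs are float32
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    feed[inp.name] = np.random.rand(*shape).astype('float32')
for out, meta in zip(sess.run(None, feed), sess.get_outputs()):
    print(meta.name, out.shape)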

For other deployment targets, refer to the projects provided by the official repo.

7. Comparison and Summary

7.1 Metric comparison

The model provided in this project was trained for 200 epochs and evaluates at AP 24.6 / AP50 39.6. Against the official PaddleDetection numbers quoted in the introduction, this already surpasses YOLOX and is slightly below YOLOv5. Note, however, that we used PP-YOLOE's s model, the smallest of the family; larger models would score higher, and the official figures were presumably obtained with the larger x-scale models. Also, this project trains on 10 categories, whereas the official comparison uses 9.

Overall, reaching this level with the s model is already remarkable and shows how strong PP-YOLOE is.

7.2 Ways to improve

This project implements a first VisDrone detector with the s model. Given enough compute, several approaches could push the metrics further, possibly up to the official numbers (see the sketch after this list):

1) Train at a higher resolution. VisDrone is dominated by small objects, so a larger input resolution should yield better metrics if resources allow;

2) Load COCO-pretrained weights. Only backbone-pretrained weights are loaded at the moment; starting from the official COCO-pretrained detector should add an estimated 1-2 points;

3) Tune other hyperparameters, such as the learning rate and the optimizer;

4) Use a larger model scale for a further boost.
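
For item 2, the override could look like the sketch below; the weights URL is assumed from the PaddleDetection model zoo, so verify it before use:

# Sketch: fine-tune from the official COCO-pretrained detector instead of
# the backbone-only weights (URL assumed from the PaddleDetection model zoo).
!python tools/train.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml --eval --amp \
    -o pretrain_weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams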
