
1. Model Introduction

PP-YOLO is a YOLO-series detection model open-sourced in PaddleDetection. Its overall structure is summarized below.

A YOLO detector consists of three main parts.

YOLO Backbone: the backbone is a convolutional network that aggregates image pixels into features of different granularities; it is usually pre-trained on a classification dataset (typically ImageNet). PP-YOLO replaces the YOLOv3 Darknet53 backbone with a ResNet50-vd-dcn ConvNet, which is better optimized in most frameworks and has fewer parameters than Darknet53. Swapping the backbone alone already gives PP-YOLO a measurable mAP gain.

YOLO Neck: like YOLOv3, PP-YOLO uses an FPN (feature pyramid network) structure for feature fusion. The last three backbone stages C3, C4 and C5 are taken and, before being passed to the prediction heads, the high-level semantic information is fused with the lower-level features through the FPN.

YOLO Head: the detection head is the part of the network that predicts bounding boxes and classes; it is supervised by the three YOLO loss terms for classification, box regression and objectness. The original YOLOv3 head is a very simple structure: a 3x3 convolution followed by a 1x1 convolution that adjusts the features to the required number of output channels (see the sketch below).
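
A minimal sketch of such a head in PaddlePaddle (an illustration only, not PaddleDetection's actual PP-YOLO head): a 3x3 convolution followed by a 1x1 convolution that maps the features to num_anchors * (5 + num_classes) output channels.

import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class SimpleYOLOHead(nn.Layer):
    """Minimal YOLOv3-style head: 3x3 conv + 1x1 prediction conv per feature level."""
    def __init__(self, in_channels, num_anchors=3, num_classes=1):
        super().__init__()
        self.conv = nn.Conv2D(in_channels, in_channels * 2, kernel_size=3, padding=1)
        # The 1x1 conv adjusts channels to num_anchors * (x, y, w, h, objectness + classes)
        self.pred = nn.Conv2D(in_channels * 2, num_anchors * (5 + num_classes), kernel_size=1)

    def forward(self, x):
        return self.pred(F.relu(self.conv(x)))

# Example: a 512-channel C5-level feature map at 19x19 for a 608x608 input
feat = paddle.randn([1, 512, 19, 19])
print(SimpleYOLOHead(512, num_anchors=3, num_classes=1)(feat).shape)  # [1, 18, 19, 19]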

With mixup as the only data augmentation, YOLOv3 is further optimized and improved through a sensible combination of tricks, for example a larger batch size of 192 (8 GPUs x 24 per GPU), Matrix NMS and CoordConv. The final model reaches 45.9% mAP on the COCO test-dev2017 dataset, with an FP32 inference speed of 72.9 FPS on a single V100 (the comparison figure against other models is omitted here). A minimal CoordConv sketch follows as an illustration of one of these tricks.
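
The sketch below is an illustration only, not PP-YOLO's exact implementation: two normalized coordinate channels are concatenated to the feature map before an ordinary convolution, which makes the convolution location-aware.

import paddle
import paddle.nn as nn

class CoordConv2D(nn.Layer):
    """CoordConv sketch: append normalized x/y coordinate channels, then apply a regular conv."""
    def __init__(self, in_channels, out_channels, kernel_size, padding=0):
        super().__init__()
        # +2 input channels for the x and y coordinate maps
        self.conv = nn.Conv2D(in_channels + 2, out_channels, kernel_size, padding=padding)

    def forward(self, x):
        n, _, h, w = x.shape
        ys = paddle.linspace(-1.0, 1.0, h).reshape([1, 1, h, 1]).expand([n, 1, h, w])
        xs = paddle.linspace(-1.0, 1.0, w).reshape([1, 1, 1, w]).expand([n, 1, h, w])
        return self.conv(paddle.concat([x, xs, ys], axis=1))

feat = paddle.randn([2, 256, 38, 38])
print(CoordConv2D(256, 256, kernel_size=3, padding=1)(feat).shape)  # [2, 256, 38, 38]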

2. PPYOLO Dataset Preparation

2.1 Dataset Description

Original data download link: http://data.mendeley.com/datasets/7b52hhzs3r/1

The dataset contains 2,400 raw images in total: 1,200 belong to the smoking (smoker) class and the remaining 1,200 to the not-smoking (non-smoker) class. It was curated by crawling various search engines with multiple keywords, including smoking, smoker, person, coughing, taking an inhaler, person on the phone, drinking water, and so on. To support better model training, the curators tried to include versatile images in both classes so as to create a degree of inter-class confusion. For example, the smoking class contains images of smokers from multiple angles and in various poses, while the not-smoking class contains non-smokers whose gestures are somewhat similar to smoking, such as people drinking water, using inhalers, holding phones, or biting their nails. Future researchers can use this dataset to develop machine learning algorithms for automatically detecting and screening smokers, supporting green environments and smart-city monitoring.

2.2 Data Annotation

The original data is already split into smoking and notsmoking classes, so a classification model would be enough for this task. To walk through the full training and deployment workflow of PaddleDetection's YOLO-series models, the smoking images were re-annotated locally with LabelImg in YOLO format. The annotated dataset has been uploaded here: https://aistudio.baidu.com/aistudio/datasetdetail/198887 , feel free to download and use it. (The dataset even contains a photo of "树哥"!)
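
For reference, each YOLO-format .txt file written by LabelImg contains one line per box: class_id x_center y_center width height, all normalized to [0, 1] relative to the image size. An illustrative line (values made up, not taken from the dataset):

# smoking_0001.txt -- class_id x_center y_center width height
0 0.5321 0.4218 0.1150 0.0870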

2.3 Data Processing

A note up front: data cleaning matters! During preprocessing, cv2.imread() raised strange errors; after some digging it turned out that several .jpg files were actually GIFs. In the end the following files were found to be problematic and were simply deleted (a format-check sketch follows the list below):
training_data/smoking/smoking_0687.jpg: actual format is GIF
validation_data/smoking/smoking_0702.jpg: actual format is GIF
training_data/smoking/smoking_0094.jpg: actual class is notsmoking
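
Below is a minimal sketch of how such mismatched files can be found, assuming the original archive has been extracted into a dataset/ folder as in the copy commands later in this notebook; it reads the real image format from the file header with the standard-library imghdr module.

# Find files whose extension says .jpg but whose header says otherwise (e.g. GIF)
import glob
import imghdr

for path in glob.glob('./dataset/*_data/smoking/*.jpg'):
    real_fmt = imghdr.what(path)  # 'jpeg', 'gif', 'png', ... or None if unrecognized
    if real_fmt != 'jpeg':
        print(f'{path}: extension is .jpg but actual format is {real_fmt}')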

The original data archive is in RAR format; the annotated dataset archive is a ZIP.

2.3.1 Unpack the Dataset

Unpack the dataset, move it to the work directory, and build the training dataset directory structure.

# Install the unrar package (only needed for the original RAR archive)
# !pip install --upgrade pip
# !pip install unrar
# !unzip  /home/aistudio/data/data198887/7b52hhzs3r-1.zip
# !mv smokingVSnotsmoking.rar /home/aistudio/work
# %cd /home/aistudio/work 
# !rar x smokingVSnotsmoking.rar
!pwd
!unzip /home/aistudio/data/data198887/smokingVSnotsmoking_mark.zip -d /home/aistudio/work
import os
import numpy as np
import matplotlib.pyplot as plt
import glob
import shutil
import tqdm

# Create the nested directories:
# ./smoke_data/images/train, ./smoke_data/images/val,
# ./smoke_data/labels/train, ./smoke_data/labels/val
for sub_dir in ['images/train', 'images/val', 'labels/train', 'labels/val']:
    os.makedirs(os.path.join('./smoke_data', sub_dir), exist_ok=True)
# Check the directory structure
! tree ./smoke_data
./smoke_data
├── images
│   ├── train
│   └── val
└── labels
    ├── train
    └── val

6 directories, 0 files
%cd /home/aistudio/work/
!cp -rf ./dataset/training_data/smoking/* ./smoke_data/images/train
!cp ./dataset/training_data/lable/smoking* ./smoke_data/labels/train
!cp -rf ./dataset/validation_data/smoking/* ./smoke_data/images/val
!cp ./dataset/validation_data/lable/smoking* ./smoke_data/labels/val
print('train_img_file_names:', len(os.listdir('./smoke_data/images/train')))
print('train_label_file_names:', len(os.listdir('./smoke_data/labels/train')))
print('val_img_file_names:', len(os.listdir('./smoke_data/images/val')))
print('val_label_file_names:', len(os.listdir('./smoke_data/labels/val')))  
/home/aistudio/work
train_img_file_names: 803
train_label_file_names: 803
val_img_file_names: 200
val_label_file_names: 200
2.3.2 Convert YOLO-format Labels to VOC Format

The data was annotated in YOLO format, but at training time it turned out PaddleDetection only accepts the COCO and VOC formats, so the labels have to be converted.

from xml.dom.minidom import Document
import os
import cv2
 
# Convert YOLO-format txt annotations into VOC-format xml annotations
def makexml(picPath, txtPath, xmlPath):  # image folder, txt label folder, output xml folder
    """Convert every YOLO txt label file in txtPath into a VOC xml file in xmlPath."""
    # Mapping from YOLO class ids to class names; it must match classes.txt and keep the same order
    dic = {'0': "smoking",
           # '1': "another class name",
           }

    files = os.listdir(txtPath)
    for i, name in enumerate(files):
        xmlBuilder = Document()
        annotation = xmlBuilder.createElement("annotation")  # root <annotation> element
        xmlBuilder.appendChild(annotation)
        txtFile = open(txtPath + name)
        txtList = txtFile.readlines()
        # cv2.imread returns None for unreadable files, hence the data cleaning described above
        img = cv2.imread(picPath + name[0:-4] + ".jpg")
        Pheight, Pwidth, Pdepth = img.shape

        folder = xmlBuilder.createElement("folder")  # <folder>
        foldercontent = xmlBuilder.createTextNode("smoking")
        folder.appendChild(foldercontent)
        annotation.appendChild(folder)

        filename = xmlBuilder.createElement("filename")  # <filename>
        filenamecontent = xmlBuilder.createTextNode(name[0:-4] + ".jpg")
        filename.appendChild(filenamecontent)
        annotation.appendChild(filename)

        folder = xmlBuilder.createElement("path")  # <path>
        foldercontent = xmlBuilder.createTextNode(picPath + name[0:-4] + ".jpg")
        folder.appendChild(foldercontent)
        annotation.appendChild(folder)

        size = xmlBuilder.createElement("size")  # <size> holds <width>, <height>, <depth>
        width = xmlBuilder.createElement("width")
        widthcontent = xmlBuilder.createTextNode(str(Pwidth))
        width.appendChild(widthcontent)
        size.appendChild(width)

        height = xmlBuilder.createElement("height")
        heightcontent = xmlBuilder.createTextNode(str(Pheight))
        height.appendChild(heightcontent)
        size.appendChild(height)

        depth = xmlBuilder.createElement("depth")
        depthcontent = xmlBuilder.createTextNode(str(Pdepth))
        depth.appendChild(depthcontent)
        size.appendChild(depth)

        annotation.appendChild(size)
 
        for j in txtList:
            oneline = j.strip().split(" ")  # YOLO line: class x_center y_center width height (all normalized)
            object = xmlBuilder.createElement("object")  # <object>
            picname = xmlBuilder.createElement("name")   # <name>: class name looked up in dic
            namecontent = xmlBuilder.createTextNode(dic[str((oneline[0]))])
            picname.appendChild(namecontent)
            object.appendChild(picname)

            pose = xmlBuilder.createElement("pose")  # <pose>
            posecontent = xmlBuilder.createTextNode("Unspecified")
            pose.appendChild(posecontent)
            object.appendChild(pose)

            truncated = xmlBuilder.createElement("truncated")  # <truncated>
            truncatedContent = xmlBuilder.createTextNode("0")
            truncated.appendChild(truncatedContent)
            object.appendChild(truncated)

            difficult = xmlBuilder.createElement("difficult")  # <difficult>
            difficultcontent = xmlBuilder.createTextNode("0")
            difficult.appendChild(difficultcontent)
            object.appendChild(difficult)

            bndbox = xmlBuilder.createElement("bndbox")  # <bndbox>: normalized center/size -> pixel corners
            xmin = xmlBuilder.createElement("xmin")  # xmin = (x_center - width / 2) * image_width + 1
            mathData = int(((float(oneline[1])) * Pwidth + 1) - (float(oneline[3])) * 0.5 * Pwidth)
            xminContent = xmlBuilder.createTextNode(str(mathData))
            xmin.appendChild(xminContent)
            bndbox.appendChild(xmin)

            ymin = xmlBuilder.createElement("ymin")  # ymin = (y_center - height / 2) * image_height + 1
            mathData = int(((float(oneline[2])) * Pheight + 1) - (float(oneline[4])) * 0.5 * Pheight)
            yminContent = xmlBuilder.createTextNode(str(mathData))
            ymin.appendChild(yminContent)
            bndbox.appendChild(ymin)

            xmax = xmlBuilder.createElement("xmax")  # xmax = (x_center + width / 2) * image_width + 1
            mathData = int(((float(oneline[1])) * Pwidth + 1) + (float(oneline[3])) * 0.5 * Pwidth)
            xmaxContent = xmlBuilder.createTextNode(str(mathData))
            xmax.appendChild(xmaxContent)
            bndbox.appendChild(xmax)

            ymax = xmlBuilder.createElement("ymax")  # ymax = (y_center + height / 2) * image_height + 1
            mathData = int(((float(oneline[2])) * Pheight + 1) + (float(oneline[4])) * 0.5 * Pheight)
            ymaxContent = xmlBuilder.createTextNode(str(mathData))
            ymax.appendChild(ymaxContent)
            bndbox.appendChild(ymax)

            object.appendChild(bndbox)

            annotation.appendChild(object)

        f = open(xmlPath + name[0:-4] + ".xml", 'w')
        xmlBuilder.writexml(f, indent='\t', newl='\n', addindent='\t', encoding='utf-8')
        f.close()
 
if __name__ == "__main__":
    if not os.path.exists('/home/aistudio/work/dataset/training_data/annotations'):
        os.makedirs("/home/aistudio/work/dataset/training_data/annotations")
    picPath = "/home/aistudio/work/dataset/training_data/smoking/"      # image folder (trailing / is required)
    txtPath = "/home/aistudio/work/dataset/training_data/lable/"        # txt label folder (trailing / is required)
    xmlPath = "/home/aistudio/work/dataset/training_data/annotations/"  # output xml folder (trailing / is required)
    makexml(picPath, txtPath, xmlPath)
    if not os.path.exists('/home/aistudio/work/dataset/validation_data/annotations'):
        os.makedirs("/home/aistudio/work/dataset/validation_data/annotations")
    picPath = "/home/aistudio/work/dataset/validation_data/smoking/"      # image folder (trailing / is required)
    txtPath = "/home/aistudio/work/dataset/validation_data/lable/"        # txt label folder (trailing / is required)
    xmlPath = "/home/aistudio/work/dataset/validation_data/annotations/"  # output xml folder (trailing / is required)
    makexml(picPath, txtPath, xmlPath)
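
A quick optional sanity check, assuming the conversion above has run: parse one generated XML file with the standard library and confirm the box coordinates are ordered and fall inside the image.

# Sanity check on one generated VOC xml file (illustrative, optional)
import glob
import xml.etree.ElementTree as ET

xml_file = sorted(glob.glob('/home/aistudio/work/dataset/training_data/annotations/*.xml'))[0]
root = ET.parse(xml_file).getroot()
w = int(root.find('size/width').text)
h = int(root.find('size/height').text)
for obj in root.findall('object'):
    box = obj.find('bndbox')
    xmin, ymin = int(box.find('xmin').text), int(box.find('ymin').text)
    xmax, ymax = int(box.find('xmax').text), int(box.find('ymax').text)
    if not (0 <= xmin < xmax <= w + 1 and 0 <= ymin < ymax <= h + 1):
        print('suspicious box in', xml_file)
    print(obj.find('name').text, (xmin, ymin, xmax, ymax), 'image size:', (w, h))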
2.3.3 Generate the Train / Val / Test Lists and the Label File
train_f = open('/home/aistudio/work/smoke_data_voc/train.txt', 'w')   # training list file
val_f = open('/home/aistudio/work/smoke_data_voc/val.txt', 'w')       # validation list file
text_f = open('/home/aistudio/work/smoke_data_voc/test.txt', 'w')     # test list file

root_dir  = '/home/aistudio/work/smoke_data_voc/'   # dataset root directory
train_dir = root_dir + 'training_data/'             # training set path
val_dir = root_dir + 'validation_data/'             # validation set path
test_dir = root_dir + 'testing_data/'               # test set path

# train.txt / val.txt: one "image_path xml_path" pair per line
dir_dic = {train_dir: train_f, val_dir: val_f}
for data_dir, file_dir in dir_dic.items():
    path_list = list()
    for img in os.listdir(data_dir + 'smoking'):
        img_path = os.path.join(data_dir + 'smoking', img)
        xml_path = os.path.join(data_dir + 'annotations', img.replace('jpg', 'xml'))
        path_list.append((img_path, xml_path))

    for i, content in enumerate(path_list):
        img, xml = content
        text = img + ' ' + xml + '\n'
        file_dir.write(text)

    file_dir.close()

# test.txt: one image path per line (no annotations)
path_list = list()
for img in os.listdir(test_dir):
    img_path = os.path.join(test_dir, img)
    path_list.append((img_path))

for i, content in enumerate(path_list):
    img = content
    text = img + '\n'
    text_f.write(text)
text_f.close()

# label_list.txt: one class name per line (the classes to detect)
label = ['smoking']
with open(root_dir + '/label_list.txt', 'w') as f:
    for text in label:
        f.write(text + '\n')
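
As a quick check (assuming the cells above have been run), count the lines written to each generated list file:

# Count entries in the generated list files
for list_name in ['train.txt', 'val.txt', 'test.txt', 'label_list.txt']:
    with open(root_dir + list_name) as f:
        print(list_name, '->', len(f.readlines()), 'lines')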

3. Model Training

3.1 Download PaddleDetection and Install Dependencies

# Install dependencies
# !git clone https://gitee.com/paddlepaddle/PaddleDetection.git -b release/2.6
%cd /home/aistudio/work/PaddleDetection
!pip install --upgrade pip
!pip install --user -r requirements.txt

3.2 Configuration Files

Configuration guide: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/configannotation
Training a model with Paddle requires editing a set of modular configuration files (with one of the officially provided datasets you can of course train out of the box). The official guide describes the changes for COCO-format datasets; for a VOC-format dataset the main changes are:
a. /PaddleDetection/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml
change metric to VOC and adjust hyper-parameters such as the number of epochs (epoch), the learning rate (base_lr) and batch_size;
b. /PaddleDetection/configs/ppyolo/_base_/ppyolo_reader.yml
change the batch_size used for training, evaluation and testing;
c. /PaddleDetection/configs/datasets/voc.yml
set num_classes to 1, use the VOCDataSet dataset type, and set dataset_dir to the dataset path.

Training on a single card takes too long, so 4 cards were used here, with batch_size set to 24, epoch set to 583 and base_lr changed to 0.0002.
The VisualDL service in the left-hand visualization panel can be used to monitor training in real time.

3.3 Train on Four V100 Cards

Add --use_vdl=True --vdl_log_dir="./output" to the command line to set the VisualDL log directory.

# Train the ppyolo model
!export CUDA_VISIBLE_DEVICES=0,1,2,3
# !python tools/train.py -c ./configs/ppyolo/ppyolo_r50vd_dcn_voc.yml --eval --use_vdl=True --vdl_log_dir="./output"
!python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c ./configs/ppyolo/ppyolo_r50vd_dcn_voc.yml --eval --use_vdl=True --vdl_log_dir="./output" 

4. Model Evaluation

PaddleDetection also provides the tools/eval.py script for model evaluation; the weights to evaluate can be specified with -o weights=. When the --eval flag is added during training, the checkpoint with the best evaluation result among all checkpoints is saved as best_model.pdparams. As shown below, the evaluation result is mAP(0.50, 11point) = 61.21%.
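
For reference, mAP(0.50, 11point) is the classic PASCAL VOC 2007 metric: detections are matched to ground truth at an IoU threshold of 0.5, and average precision is the mean of the maximum precision at 11 recall thresholds (0.0, 0.1, ..., 1.0). A minimal sketch of the 11-point AP computation, given precision/recall arrays that have already been accumulated (illustrative, not PaddleDetection's own code):

import numpy as np

def voc_ap_11point(recall, precision):
    """11-point interpolated AP: average the max precision at recall >= t for t = 0.0, 0.1, ..., 1.0."""
    ap = 0.0
    for t in np.arange(0.0, 1.1, 0.1):
        p = np.max(precision[recall >= t]) if np.any(recall >= t) else 0.0
        ap += p / 11.0
    return ap

# Toy example: 5 detections sorted by confidence against 5 ground-truth boxes
recall = np.array([0.2, 0.4, 0.4, 0.6, 0.8])
precision = np.array([1.0, 1.0, 0.67, 0.75, 0.8])
print('AP = %.4f' % voc_ap_11point(recall, precision))  # -> about 0.745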

# Evaluate the model
%cd /home/aistudio/work/PaddleDetection
!python -u tools/eval.py \
-c ./configs/ppyolo/ppyolo_r50vd_dcn_voc.yml \
-o weights=./output/ppyolo_r50vd_dcn_voc/model_final
/home/aistudio/work/PaddleDetection
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Sized
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0318 22:58:55.481233  9434 gpu_context.cc:244] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0318 22:58:55.486397  9434 gpu_context.cc:272] device: 0, cuDNN Version: 8.2.
[03/18 22:59:01] ppdet.utils.checkpoint INFO: Finish loading model weights: ./output/ppyolo_r50vd_dcn_voc/model_final.pdparams
[03/18 22:59:02] ppdet.engine INFO: Eval iter: 0
[03/18 22:59:10] ppdet.metrics.metrics INFO: Accumulating evaluatation results...
[03/18 22:59:10] ppdet.metrics.metrics INFO: mAP(0.50, 11point) = 61.21%
[03/18 22:59:10] ppdet.engine INFO: Total sample number: 199, average FPS: 22.408059600797294

5. Model Prediction

# Run prediction on a test image
%cd /home/aistudio/work/PaddleDetection
! python tools/infer.py -c ./configs/ppyolo/ppyolo_r50vd_dcn_voc.yml --infer_img=/home/aistudio/work/smoke_data_voc/testing_data/abc089.jpg -o weights=/home/aistudio/work/PaddleDetection/output/ppyolo_r50vd_dcn_voc/model_final
# Visualize the prediction result
import matplotlib.pyplot as plt
import cv2
img = cv2.imread("/home/aistudio/work/PaddleDetection/output/abc089.jpg")
img_ = cv2.resize(img, (300,300));
# plt.figure(figsize=(15, 15))
plt.imshow(cv2.cvtColor(img_, cv2.COLOR_BGR2RGB))
plt.show()

(Figure: visualization of the prediction result on abc089.jpg; image omitted.)

6. Model Deployment

6.1 Model Export

PaddleDetection provides a model export tool; the exported pdmodel is used for deployment and inference.

## Export the model
%cd /home/aistudio/work/PaddleDetection
! python tools/export_model.py \
    -c ./configs/ppyolo/ppyolo_r50vd_dcn_voc.yml \
    -o weights=/home/aistudio/work/PaddleDetection/output/ppyolo_r50vd_dcn_voc/model_final.pdparams \
    TestReader.inputs_def.image_shape=[3,608,608] \
    --output_dir inference_model

6.2 Model Inference

# Run inference with the exported model
%cd /home/aistudio/work/PaddleDetection
! python deploy/python/infer.py \
    --model_dir=./output_inference/ppyolo_r50vd_dcn_voc \
    --image_file=/home/aistudio/work/smoke_data_voc/testing_data/abc155.jpg \
    --device=GPU \
    --thresh=0.2 

/home/aistudio/work/PaddleDetection
-----------  Running Arguments -----------
action_file: None
batch_size: 1
camera_id: -1
combine_method: nms
cpu_threads: 1
device: GPU
enable_mkldnn: False
enable_mkldnn_bfloat16: False
image_dir: None
image_file: /home/aistudio/work/smoke_data_voc/testing_data/abc155.jpg
match_metric: ios
match_threshold: 0.6
model_dir: ./output_inference/ppyolo_r50vd_dcn_voc
output_dir: output
overlap_ratio: [0.25, 0.25]
random_pad: False
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: paddle
save_images: True
save_mot_txt_per_img: False
save_mot_txts: False
save_results: False
scaled: False
slice_infer: False
slice_size: [640, 640]
threshold: 0.2
tracker_config: None
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_coco_category: False
use_dark: True
use_gpu: False
video_file: None
window_size: 50
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order: 
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
class_id:0, confidence:0.6626, left_top:[634.51,356.80],right_bottom:[811.03,490.34]
save result to: output/abc155.jpg
Test iter 0
------------------ Inference Time Info ----------------------
total_time(ms): 2229.7, img_num: 1
average latency time(ms): 2229.70, QPS: 0.448491
preprocess_time(ms): 1642.30, inference_time(ms): 587.40, postprocess_time(ms): 0.00


# Visualize the inference result
import matplotlib.pyplot as plt
import cv2
img = cv2.imread("/home/aistudio/work/PaddleDetection/output/abc155.jpg")
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

(Figure: visualization of the inference result on abc155.jpg; image omitted.)

7. Summary

  • This project used PP-YOLO as the pretrained model and reached a final mAP(0.50) of 61.21%; after deployment on a V100, inference takes 587.40 ms per image (888x1024), so model compression would still be needed to speed up a real deployment;
  • YOLOv5 was also tried during the project and reached a mAP(0.50) of 75.1%; that model is stored at /home/aistudio/work/YOLOv5-Paddle/yolosmokes/s1202/weights;
  • Deployment with TensorRT was originally planned, but during ONNX conversion two operators turned out to be unsupported (deformable_conv and matrix_nms); changing versions did not solve it, which is a pity;
  • This project was a good hands-on exercise in dataset annotation, model training and deployment; thanks to the AI Studio platform for providing convenient tools that let beginners get started quickly.

Author: 高壮

Mentor: 刘建建

This article is a repost.
