PPYOLOE:遥感场景下的小目标检测与部署(切图版)

百度飞桨针对小目标检测的典型场景,提供了PP-YOLOE Smalldet一键实现切图配置与训练。

详细文档可参考:PP-YOLOE Smalldet 检测模型

在本项目中,我们展示了从模型训练到部署的整个流程。并给出了以遥感目标检测为背景的典型应用案例,帮助用户快速上手和理解整个PP-YOLOE Smalldet项目。

0 项目背景

0.1 数据集介绍

NWPU VHR-10数据集包含800个高分辨率的卫星图像,这些图像是从Google Earth和Vaihingen数据集裁剪而来的,然后由专家手动注释。数据集分成10类(飞机,轮船,储罐,棒球场,网球场,篮球场,地面跑道,港口,桥梁和车辆)。

它由715幅RGB图像和85幅锐化彩色红外图像组成。其中715幅RGB图像采集自谷歌地球,空间分辨率从0.5m到2m不等。85幅经过pan‐锐化的红外图像,空间分辨率为0.08m,来自Vaihingen数据。

该数据集共包含3775个对象实例,其中包括757架飞机、390个棒球方块、159个篮球场、124座桥梁、224个港口、163个田径场、302艘船、655个储罐、524个网球场和477辆汽车,这些对象实例都是用水平边框手工标注的。

原始数据集包含以下文件:

  • negative image set:包含150个不包含给定对象类别的任何目标的图像
  • positive image set:650个图像,每个图像至少包含一个要检测的目标
  • ground truth:包含650个单独的文本文件,每个对应于“正图像集”文件夹中的图像。这些文本文件的每一行都以以下格式定义了ground truth边界框:
(x1,y1),(x2,y2),a
其中(x1,y1)表示边界框的左上角坐标,(x2,y2)表示边界框的右下角坐标,
a是对象类别(1-飞机,2-轮船,3-储罐,4-棒球场,5-网球场,6-篮球场,7-田径场,8-港口,9-桥梁,10-车辆)。

https://static.ligongku.com/resource/intro/img/74c9a00f9b33415e946929cadf12dbeb.jpg

参考文献:

[1] Gong Cheng, Junwei Han, Peicheng Zhou, Lei Guo. Multi-class geospatial object detection and geographic image classification based on collection of part detectors. ISPRS Journal of Photogrammetry and Remote Sensing, 98: 119-132, 2014.
[2] Gong Cheng, Junwei Han. A survey on object detection in optical remote sensing images. ISPRS Journal of Photogrammetry and Remote Sensing, 117: 11-28, 2016.
[3] Gong Cheng, Peicheng Zhou, Junwei Han. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 54(12): 7405-7415, 2016.
## NWPU VHR-10数据集的标注经简单处理即可转换为COCO格式
# !git clone https://github.com/lavish619/DeepLab_NWPU-VHR-10_Dataset_coco

0.2 PP-YOLOE Smalldet模型库

PP-YOLOE Smalldet提供了以下结合切图工具的模型库,读者可以根据需要选用。

模型 数据集 SLICE_SIZE OVERLAP_RATIO 类别数 mAPval
0.5:0.95
APval
0.5
下载链接 配置文件
PP-YOLOE-l VisDrone-DET 640 0.25 10 29.7 48.5 下载链接 配置文件
PP-YOLOE-l (Assembled) VisDrone-DET 640 0.25 10 37.2 59.4 下载链接 配置文件
  • SLICE_SIZE表示使用SAHI工具切图后子图的边长大小,OVERLAP_RATIO表示切图的子图之间的重叠率,DOTA水平框和Xview数据集均是切图后训练,AP指标为切图后的子图val上的指标。
  • PP-YOLOE模型训练过程中使用8 GPUs进行混合精度训练,如果GPU卡数或者batch size发生了改变,你需要按照公式 lrnew = lrdefault * (batch_sizenew * GPU_numbernew) / (batch_sizedefault * GPU_numberdefault) 调整学习率。
  • 自动切图和拼图的推理预测需添加设置--slice_infer
  • Assembled表示自动切图和拼图

在本项目中,我们将原图训练与切图训练进行对比,看看在遥感数据集上,切图拼图的目标检测解决方案会有怎样的表现。

0.3 SAHI切图工具介绍

SAHI是用于超大图片中对小目标检测的切片辅助超推理库。该库可直接用于现有的网络,而不需要重新设计和训练模型,使用十分方便。

在这里插入图片描述

显然,对于超大分辨率的数据集,通过切图工具,我们可以在不重新训练模型并且不需要更大的GPU内存分配的情况下,检测图中较小的对象。

1 配置运行环境

环境要求: PaddleDetection release/2.5版本

通过以下命令获取PaddleDetection套件代码

# 引入PaddleX,可以安装PaddleDet所需关键依赖包
!pip install paddlex
# 参考SAHI installation进行安装
!pip install sahi
!git clone https://gitee.com/paddlepaddle/PaddleDetection.git
%cd PaddleDetection
!git checkout release/2.5

1.1 一键切图

!python tools/slice_image.py --image_dir /home/aistudio/NV10-dataset/images --json_path /home/aistudio/DeepLab_NWPU-VHR-10_Dataset_coco/NWPU*/instances_train2017.json --output_dir dataset/NV10_sliced --slice_size 500 --overlap_ratio 0.25
!python tools/slice_image.py --image_dir /home/aistudio/NV10-dataset/images --json_path /home/aistudio/DeepLab_NWPU-VHR-10_Dataset_coco/NWPU*/instances_val2017.json --output_dir dataset/NV10_sliced --slice_size 500 --overlap_ratio 0.25

2 模型训练

训练模型主要包括准备训练数据以及启动训练命令,可以按照下面的命令执行。

# 覆盖配置文件
!cp ../ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml
!cp ../visdrone_sliced_640_025_detection.yml configs/smalldet/_base_/visdrone_sliced_640_025_detection.yml
# 开始训练
!python tools/train.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --use_vdl=True --vdl_log_dir=./sliced_visdrone/ --eval

原图训练和子图训练的对比情况:

  • mAP指标
- loss指标

从上面的训练过程监控可以看出,子图训练耗时明显更长,大约是原图训练的4倍。但是训练效果却有些“翻车”,子图训练效果没有明显好于原图训练。下面,我们通过模型评估来进一步研究这个问题。

3 模型评估

在训练模型以后,我们可以通过运行评估命令来得到模型的精度,以确认训练的效果。评估可以参考以下命令执行。

这里使用了我们已经训练好的模型。如希望使用自己训练的模型,请对应将weights=后的值更改为对应模型.pdparams文件的存储路径。

import warnings
warnings.filterwarnings('ignore')
!cp ../visdrone_sliced_640_025_detection.yml configs/smalldet/_base_/visdrone_sliced_640_025_detection.yml
# 子图评估
!python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0907 08:28:38.072043  3216 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0907 08:28:38.075594  3216 device_context.cc:465] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
[09/07 08:28:43] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
[09/07 08:28:43] ppdet.engine INFO: Eval iter: 0
[09/07 08:28:48] ppdet.engine INFO: Eval iter: 100
[09/07 08:28:53] ppdet.engine INFO: Eval iter: 200
[09/07 08:28:57] ppdet.engine INFO: Eval iter: 300
[09/07 08:29:00] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.15s)
creating index...
index created!
[09/07 08:29:00] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.93s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=5.03s).
Accumulating evaluation results...
DONE (t=1.13s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.765
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.965
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.849
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.738
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.767
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.789
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.337
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.768
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.825
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.779
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.820
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.873
[09/07 08:29:08] ppdet.engine INFO: Total sample number: 634, averge FPS: 40.47384146981183
!cp ../ppyoloe_crn_l_80e_sliced_visdrone_640_025-Copy1.yml configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025-Copy1.yml
# 原图评估
!python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025-Copy1.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0907 08:29:12.253398  3421 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0907 08:29:12.256899  3421 device_context.cc:465] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[09/07 08:29:16] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
[09/07 08:29:17] ppdet.engine INFO: Eval iter: 0
[09/07 08:29:21] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[09/07 08:29:21] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.24s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=1.68s).
Accumulating evaluation results...
DONE (t=0.29s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.764
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.978
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.877
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.728
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.753
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.784
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.291
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.697
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.812
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.772
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.798
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.857
[09/07 08:29:23] ppdet.engine INFO: Total sample number: 130, averge FPS: 35.9806195698423
# 子图拼图评估
!cp ../ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml  configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml 
!python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams  --slice_infer --combine_method=nms --match_threshold=0.6 --match_metric=ios
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0907 08:29:27.680362  3572 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0907 08:29:27.684113  3572 device_context.cc:465] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
[09/07 08:29:32] ppdet.data.source.coco INFO: 714 samples and slice to 714 sub_samples in file /home/aistudio/PaddleDetection/dataset/NV10_sliced/instances_val2017_500_025.json
[09/07 08:29:34] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
[09/07 08:29:34] ppdet.engine INFO: Eval iter: 0
[09/07 08:29:40] ppdet.engine INFO: Eval iter: 100
[09/07 08:29:46] ppdet.engine INFO: Eval iter: 200
[09/07 08:29:52] ppdet.engine INFO: Eval iter: 300
[09/07 08:29:57] ppdet.engine INFO: Eval iter: 400
[09/07 08:30:03] ppdet.engine INFO: Eval iter: 500
[09/07 08:30:08] ppdet.engine INFO: Eval iter: 600
imdecode_(''): can't read header: OpenCV(4.6.0) /io/opencv/modules/imgcodecs/src/grfmt_bmp.cpp:108: error: (-215:Assertion failed) m_rle_code_ >= 0 && m_rle_code_ <= BMP_BITFIELDS in function 'readHeader'

[09/07 08:30:13] ppdet.engine INFO: Eval iter: 700
[09/07 08:30:15] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.13s)
creating index...
index created!
[09/07 08:30:15] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.47s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=3.10s).
Accumulating evaluation results...
DONE (t=0.89s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.758
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.954
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.841
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.726
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.763
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.758
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.337
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.761
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.811
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.764
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.812
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.819
[09/07 08:30:20] ppdet.engine INFO: Total sample number: 714, averge FPS: 17.67189514404967
!python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams  --slice_infer --combine_method=nms --match_threshold=0.6 --match_metric=iou
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0907 08:30:24.031093  3817 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0907 08:30:24.034633  3817 device_context.cc:465] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
[09/07 08:30:28] ppdet.data.source.coco INFO: 714 samples and slice to 714 sub_samples in file /home/aistudio/PaddleDetection/dataset/NV10_sliced/instances_val2017_500_025.json
[09/07 08:30:30] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
[09/07 08:30:30] ppdet.engine INFO: Eval iter: 0
[09/07 08:30:38] ppdet.engine INFO: Eval iter: 100
[09/07 08:30:46] ppdet.engine INFO: Eval iter: 200
[09/07 08:30:54] ppdet.engine INFO: Eval iter: 300
[09/07 08:31:01] ppdet.engine INFO: Eval iter: 400
[09/07 08:31:10] ppdet.engine INFO: Eval iter: 500
[09/07 08:31:18] ppdet.engine INFO: Eval iter: 600
imdecode_(''): can't read header: OpenCV(4.6.0) /io/opencv/modules/imgcodecs/src/grfmt_bmp.cpp:108: error: (-215:Assertion failed) m_rle_code_ >= 0 && m_rle_code_ <= BMP_BITFIELDS in function 'readHeader'

[09/07 08:31:24] ppdet.engine INFO: Eval iter: 700
[09/07 08:31:27] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
[09/07 08:31:28] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.99s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=5.11s).
Accumulating evaluation results...
DONE (t=1.26s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.764
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.963
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.847
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.737
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.766
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.786
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.337
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.768
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.825
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.778
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.820
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.873
[09/07 08:31:35] ppdet.engine INFO: Total sample number: 714, averge FPS: 12.846058157834854

从上面的评估对比中我们可以发现,尽管子图切图再拼图的方式显著提高了各类目标的召回率,但是整体mAP表现非但没有优势,甚至还是明显劣势。

其原因在于,训练的NWPU VHR-10数据集,其实图片分辨率并不算大,比如这样:
在这里插入图片描述

在这种情况下,按照500*500进行切图也许并不是最佳选择。

4 模型预测

这里我们将训练好的模型对着整个NWPU VHR-10数据集来一番批量预测,和通常的直接预测相比,本项目用的是子图拼图预测,这些预测结果会记录在VisualDL中,可以很方便地与原图对比,观察预测效果。

!python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model --infer_dir ../NV10-dataset/images --use_vdl=True --vdl_log_dir=./sliced_visdrone/image --draw_threshold=0.25 --slice_infer --slice_size 500 500 --overlap_ratio 0.25 0.25 --combine_method=nms --match_threshold=0.6 --match_metric=ios

在这里插入图片描述

翻阅子图拼图模型在测试集上的表现,我们会发现其实子图拼图对相对较小的目标确实有很好的预测效果。但是,对于数据中的“大目标”,模型表现相当一般。

5 模型导出

.pdparams只包括了模型的参数数据,实际部署还需要执行导出步骤。导出步骤可以参考下面列举的步骤:

注意,这里使用了我们已经训练好的模型。如希望使用自己训练的模型,请对应将weights=后的值更改为对应模型.pdparams文件的存储路径。如果没有指定--output_dir,那么导出的模型将默认存储在output_inference/路径下。

!python tools/export_model.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[09/07 08:33:44] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[09/07 08:33:45] ppdet.engine INFO: Export inference config file to output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025/infer_cfg.yml
[09/07 08:33:54] ppdet.engine INFO: Export model and saved in output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025

至此,我们就完成了遥感小目标检测模型的从训练到导出的过程。接下来,看看该模型使用Paddle Inference部署时的具体性能表现,实现的方式仍是子图拼图预测。

6 模型部署

# 选一张验证集图片测试部署效果
!python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025 --image_file=../NV10-dataset/images/379.jpg --device=GPU --save_images=True --threshold=0.25  --slice_infer --slice_size 500 500 --overlap_ratio 0.25 0.25 --combine_method=nms --match_threshold=0.6 --match_metric=ios
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
-----------  Running Arguments -----------
action_file: None
batch_size: 1
camera_id: -1
combine_method: nms
cpu_threads: 1
device: GPU
enable_mkldnn: False
enable_mkldnn_bfloat16: False
image_dir: None
image_file: ../NV10-dataset/images/379.jpg
match_metric: ios
match_threshold: 0.6
model_dir: output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025
output_dir: output
overlap_ratio: [0.25, 0.25]
random_pad: False
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: paddle
save_images: True
save_mot_txt_per_img: False
save_mot_txts: False
save_results: False
scaled: False
slice_infer: True
slice_size: [500, 500]
threshold: 0.25
tracker_config: None
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_coco_category: False
use_dark: True
use_gpu: False
video_file: None
window_size: 50
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order: 
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
/home/aistudio/PaddleDetection/deploy/python/utils.py:360: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  suppressed = np.zeros((ndets), dtype=np.int)
class_id:9, confidence:0.9250, left_top:[431.43,252.04],right_bottom:[471.10,300.46]
/home/aistudio/PaddleDetection/deploy/python/visualize.py:162: DeprecationWarning: textsize is deprecated and will be removed in Pillow 10 (2023-07-01). Use textbbox or textlength instead.
  tw, th = draw.textsize(text)
class_id:9, confidence:0.9043, left_top:[369.16,140.31],right_bottom:[417.60,201.05]
/home/aistudio/PaddleDetection/deploy/python/visualize.py:162: DeprecationWarning: textsize is deprecated and will be removed in Pillow 10 (2023-07-01). Use textbbox or textlength instead.
  tw, th = draw.textsize(text)
class_id:9, confidence:0.9004, left_top:[278.68,188.63],right_bottom:[330.42,228.34]
/home/aistudio/PaddleDetection/deploy/python/visualize.py:162: DeprecationWarning: textsize is deprecated and will be removed in Pillow 10 (2023-07-01). Use textbbox or textlength instead.
  tw, th = draw.textsize(text)
save result to: output/379.jpg
Test iter 0
------------------ Inference Time Info ----------------------
total_time(ms): 1688.0, img_num: 1
average latency time(ms): 1688.00, QPS: 0.592417
preprocess_time(ms): 1659.40, inference_time(ms): 28.50, postprocess_time(ms): 0.10

在这里插入图片描述

7 结论

使用PP-YOLOE Smalldet 检测模型提供的拼图切图工具,可以高效地完成大分辨率图片的切图训练、拼图预测/部署,从而有效地提高对小目标的识别效果。

但是,切图拼图这种做法,在处理较大的目标时,可能由于nms设置的原因,效果相当一般。因此,如果一张图中,既有大目标又有小目标,读者在是否进行切图这样选择上,建议要谨慎处理。

同时,我们也需要清楚,在使用切图训练和预测时,调参的工作是必不可少的。

对于遥感目标检测这个场景,切图会导致预测速度降到原图预测的1/4,并且(因为大目标预测偏差)整体效果并没有提升,总体上看,该数据集不一定需要进行切图操作。前置项目PPYOLOE:遥感场景下的小目标检测与部署已经是比较合适的解决方案了。

此文章为搬运
原项目链接

Logo

学大模型,用大模型上飞桨星河社区!每天8点V100G算力免费领!免费领取ERNIE 4.0 100w Token >>>

更多推荐