PP-YOLOE: Small Object Detection and Deployment for Remote Sensing Scenes

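For typical small object detection scenarios, Baidu PaddlePaddle provides PP-YOLOE Smalldet, which handles image-slicing configuration and training in one step.

For detailed documentation, see: PP-YOLOE Smalldet detection models

This project walks through the whole workflow from model training to deployment and presents a typical application case based on remote sensing object detection, to help users get started with and understand the PP-YOLOE Smalldet project quickly.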

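Project Background

Dataset Overview

The NWPU VHR-10 dataset contains 800 high-resolution satellite images, cropped from Google Earth and the Vaihingen dataset and then manually annotated by experts. It covers 10 classes: airplane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, and vehicle.

It consists of 715 RGB images and 85 pan-sharpened color infrared images. The 715 RGB images were collected from Google Earth, with spatial resolutions ranging from 0.5 m to 2 m. The 85 pan-sharpened infrared images come from the Vaihingen data and have a spatial resolution of 0.08 m.

The dataset contains 3,775 object instances in total: 757 airplanes, 390 baseball diamonds, 159 basketball courts, 124 bridges, 224 harbors, 163 ground track fields, 302 ships, 655 storage tanks, 524 tennis courts, and 477 vehicles. All instances were manually annotated with horizontal bounding boxes.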

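The original dataset contains the following files:

  • negative image set: 150 images that contain no objects of the given classes
  • positive image set: 650 images, each containing at least one object to be detected
  • ground truth: 650 separate text files, each corresponding to one image in the "positive image set" folder. Each line of these text files defines one ground-truth bounding box in the following format:
(x1,y1),(x2,y2),a
where (x1,y1) is the top-left corner of the bounding box, (x2,y2) is the bottom-right corner,
and a is the object class (1-airplane, 2-ship, 3-storage tank, 4-baseball diamond, 5-tennis court, 6-basketball court, 7-ground track field, 8-harbor, 9-bridge, 10-vehicle).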
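The annotation format described above can be read with a few lines of Python. This is a minimal sketch, not part of the original project: the names `Box`, `CLASS_NAMES`, and `parse_ground_truth` are hypothetical, the sample text is made up for illustration, and the parser assumes integer pixel coordinates with optional whitespace around them.

```python
import re
from typing import List, NamedTuple

# Class ids as listed in the dataset description (1-based).
CLASS_NAMES = {
    1: "airplane", 2: "ship", 3: "storage tank", 4: "baseball diamond",
    5: "tennis court", 6: "basketball court", 7: "ground track field",
    8: "harbor", 9: "bridge", 10: "vehicle",
}


class Box(NamedTuple):
    x1: int
    y1: int
    x2: int
    y2: int
    class_id: int


# Matches "(x1,y1),(x2,y2),a" and tolerates extra whitespace (an assumption).
_LINE_RE = re.compile(
    r"\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*,\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*,\s*(\d+)"
)


def parse_ground_truth(text: str) -> List[Box]:
    """Parse the content of one ground-truth text file into boxes."""
    boxes = []
    for line in text.splitlines():
        match = _LINE_RE.search(line)
        if match is None:
            continue  # skip blank or malformed lines
        boxes.append(Box(*(int(group) for group in match.groups())))
    return boxes


# Made-up sample content, for illustration only.
sample = "(563,478),(630,573),1\n( 10, 20),( 30, 40), 10\n\n"
boxes = parse_ground_truth(sample)
for box in boxes:
    print(CLASS_NAMES[box.class_id], box)
```

Skipping lines that do not match keeps the parser tolerant of trailing blank lines in the annotation files.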

https://static.ligongku.com/resource/intro/img/74c9a00f9b33415e946929cadf12dbeb.jpg

References:

[1] Gong Cheng, Junwei Han, Peicheng Zhou, Lei Guo. Multi-class geospatial object detection and geographic image classification based on collection of part detectors. ISPRS Journal of Photogrammetry and Remote Sensing, 98: 119-132, 2014.
[2] Gong Cheng, Junwei Han. A survey on object detection in optical remote sensing images. ISPRS Journal of Photogrammetry and Remote Sensing, 117: 11-28, 2016.
[3] Gong Cheng, Peicheng Zhou, Junwei Han. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 54(12): 7405-7415, 2016.
## The NWPU VHR-10 annotations can be converted to COCO format with light preprocessing
# !git clone https://github.com/lavish619/DeepLab_NWPU-VHR-10_Dataset_coco
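To make the conversion concrete, here is a minimal sketch of what turning corner-format boxes into COCO annotations involves. It is not the script from the repository cloned above; the function name `to_coco`, the record layout, and the sample values are assumptions for illustration. The one fixed fact it relies on is that COCO stores each box as [x, y, width, height] rather than as two corners.

```python
import json


def to_coco(records, class_names):
    """Build a COCO-style dict.

    records: list of (file_name, width, height, boxes), where each box is
             (x1, y1, x2, y2, class_id) in pixel coordinates.
    class_names: dict mapping class id to class name.
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [
            {"id": cid, "name": name} for cid, name in sorted(class_names.items())
        ],
    }
    ann_id = 1
    for image_id, (file_name, width, height, boxes) in enumerate(records, start=1):
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for x1, y1, x2, y2, class_id in boxes:
            w, h = x2 - x1, y2 - y1
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": class_id,
                "bbox": [x1, y1, w, h],  # COCO uses top-left corner plus size
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


# Made-up example record, for illustration only.
records = [("001.jpg", 958, 808, [(563, 478, 630, 573, 1)])]
coco = to_coco(records, {1: "airplane", 2: "ship"})
print(json.dumps(coco["annotations"][0]))
```

The image width and height have to come from the image files themselves, since the NWPU VHR-10 text annotations do not record them.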

PP-YOLOE Smalldet Model Zoo

PP-YOLOE Smalldet provides the following models trained with the slicing tool; readers can choose among them as needed.

Model       Dataset   SLICE_SIZE  OVERLAP_RATIO  Classes  mAP(val) 0.5:0.95  AP(val) 0.5  Download       Config
PP-YOLOE-l  Xview     400         0.25           60       14.5               26.8         download link  config file
PP-YOLOE-l  DOTA      500         0.25           15       46.8               72.6         download link  config file
PP-YOLOE-l  VisDrone  500         0.25           10       29.7               48.5         download link  config file
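  • SLICE_SIZE is the side length of each sub-image produced by slicing with the SAHI tool (SLICE_SIZE*SLICE_SIZE); OVERLAP_RATIO is the overlap ratio between adjacent slices.
  • The PP-YOLOE models were trained with mixed precision on 8 GPUs. If you change the number of GPUs or the batch size, adjust the learning rate according to lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default).

In this project, however, training on the original images already gives good results, so this article trains on the original images. Readers can go further and compare against the predictions obtained with sliced training.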
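The learning-rate rule above is a linear scaling with the total batch size, and can be written as a one-line helper. In this sketch the function name `scale_lr` is hypothetical, and the default values (learning rate 0.01, batch size 8 per GPU) are illustrative assumptions rather than figures stated in this article; check the base config you start from.

```python
def scale_lr(lr_default, batch_size_default, gpus_default, batch_size_new, gpus_new):
    """Scale the learning rate in proportion to the total batch size."""
    return lr_default * (batch_size_new * gpus_new) / (batch_size_default * gpus_default)


# Assumed defaults: lr 0.01 with 8 GPUs and batch size 8 per GPU.
# New setup: a single GPU with the same per-GPU batch size.
new_lr = scale_lr(
    lr_default=0.01,
    batch_size_default=8,
    gpus_default=8,
    batch_size_new=8,
    gpus_new=1,
)
print(new_lr)
```

Under these assumed defaults the result is one eighth of the default rate, which happens to match the base_lr of 0.00125 used in the training config later in this article, although the article does not state how that value was derived.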

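(Optional) Introduction to the SAHI Slicing Tool

SAHI (Slicing Aided Hyper Inference) is a library for slicing-aided inference that targets small objects in very large images. It can be applied directly to existing networks without redesigning or retraining the model, which makes it very convenient to use.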


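For very high-resolution datasets, a slicing tool lets us detect smaller objects in an image without retraining the model and without allocating more GPU memory.

PaddleDetection is still iterating quickly, so readers can look forward to seeing how small object detection performs once it is combined with the SAHI slicing tool.

Setting Up the Environment

Requirement: PaddleDetection develop branch

Fetch the PaddleDetection code with the following commands: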
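The idea behind slicing can be illustrated by computing the sub-image windows for a given slice size and overlap ratio. This is a simplified sketch of the general technique, not SAHI's actual implementation; the function name `slice_boxes` and the policy of shifting the last window back so it stays inside the image are assumptions.

```python
def slice_boxes(img_w, img_h, slice_size, overlap_ratio):
    """Return (x0, y0, x1, y1) windows that cover the image with overlap."""
    # Distance between the starts of two neighbouring windows.
    step = max(1, int(slice_size * (1 - overlap_ratio)))
    boxes = []
    y = 0
    while True:
        # Shift the last row back so the window stays inside the image.
        y0 = min(y, max(img_h - slice_size, 0))
        x = 0
        while True:
            x0 = min(x, max(img_w - slice_size, 0))
            boxes.append(
                (x0, y0, min(x0 + slice_size, img_w), min(y0 + slice_size, img_h))
            )
            if x0 + slice_size >= img_w:
                break
            x += step
        if y0 + slice_size >= img_h:
            break
        y += step
    return boxes


# A 1000x640 image with 640-pixel slices and 25% overlap.
windows = slice_boxes(1000, 640, slice_size=640, overlap_ratio=0.25)
print(windows)
```

Each window is then passed through the detector on its own, and the per-window detections are shifted back to full-image coordinates and merged. The overlap reduces the chance that an object cut by one window boundary is missed entirely.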

# Install PaddleX, which pulls in the key dependencies PaddleDetection needs
!pip install paddlex
!git clone https://gitee.com/paddlepaddle/PaddleDetection.git
%cd PaddleDetection
!git checkout develop

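How to Build the Model

PP-YOLOE Smalldet covers the full training and deployment workflow for small object detection. In this part we take a remote sensing detection model as the example and walk the ppyoloe_crn_l_80e_sliced_visdrone_640_025 model through the whole pipeline, from training to exporting a deployable model. See the GitHub documentation for further details.

(Optional) Training a Reference Baseline with PaddleX

A result means little without something to compare it against. We can first train a quick baseline with PaddleX to see what it achieves on the NWPU VHR-10 dataset without any small-object-specific optimization.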

import paddlex as pdx
from paddlex import transforms as T

# Define the transforms used for training and validation
# API reference: https://github.com/PaddlePaddle/PaddleX/blob/develop/docs/apis/transforms/transforms.md
train_transforms = T.Compose([
    T.MixupImage(mixup_epoch=-1), T.RandomDistort(),
    T.RandomExpand(im_padding_value=[123.675, 116.28, 103.53]), T.RandomCrop(),
    T.RandomHorizontalFlip(), T.BatchRandomResize(
        target_sizes=[
            320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704,
            736, 768
        ],
        interp='RANDOM'), T.Normalize(
            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

eval_transforms = T.Compose([
    T.Resize(
        target_size=640, interp='CUBIC'), T.Normalize(
            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Define the datasets used for training and validation
# API reference: https://github.com/PaddlePaddle/PaddleX/blob/develop/docs/apis/datasets.md
train_dataset = pdx.datasets.CocoDetection(
    data_dir='/home/aistudio/NV10-dataset/images',
    ann_file='/home/aistudio/DeepLab_NWPU-VHR-10_Dataset_coco/NWPU VHR-10_dataset_coco/instances_train2017.json',
    transforms=train_transforms,
    num_workers='auto',
    shuffle=True,
    allow_empty=False,
    empty_ratio=1.0,
)

eval_dataset = pdx.datasets.CocoDetection(
    data_dir='/home/aistudio/NV10-dataset/images',
    ann_file='/home/aistudio/DeepLab_NWPU-VHR-10_Dataset_coco/NWPU VHR-10_dataset_coco/instances_val2017.json',
    transforms=eval_transforms,
    num_workers='auto',
    shuffle=False,
    allow_empty=False,
    empty_ratio=1.0,
)

# Initialize the model and train it
# Training metrics can be viewed with VisualDL, see https://github.com/PaddlePaddle/PaddleX/blob/develop/docs/visualdl.md
num_classes = len(train_dataset.labels)
model = pdx.det.PPYOLOv2(num_classes=num_classes, backbone='ResNet50_vd_dcn')

# API reference: https://github.com/PaddlePaddle/PaddleX/blob/develop/docs/apis/models/detection.md
# Parameter descriptions and tuning notes: https://github.com/PaddlePaddle/PaddleX/blob/develop/docs/parameters.md
model.train(
    num_epochs=170,
    train_dataset=train_dataset,
    train_batch_size=8,
    eval_dataset=eval_dataset,
    pretrain_weights='COCO',
    learning_rate=0.005 / 12,
    warmup_steps=1000,
    warmup_start_lr=0.0,
    lr_decay_epochs=[105, 135, 150],
    save_interval_epochs=5,
    save_dir='output/ppyolov2_r50vd_dcn')

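Model Training

Training mainly consists of preparing the training data and launching the training command; follow the commands below.

When adjusting the config files, the main changes are the learning rate (base_lr), the paths to the COCO-format remote sensing dataset, and the number of object classes.

The ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml file: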

epoch: 80
LearningRate:
  base_lr: 0.00125
  schedulers:
    - !CosineDecay
      max_epochs: 96
    - !LinearWarmup
      start_factor: 0.
      epochs: 1
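The schedule in this config combines a one-epoch linear warmup with a cosine decay whose horizon (96 epochs) is longer than the training length (80 epochs). The sketch below shows one way to read those settings. It is a simplified model under stated assumptions, not PaddleDetection's exact implementation: the function name `learning_rate` and the value of `steps_per_epoch` are hypothetical, and the real per-step behaviour may differ in detail.

```python
import math


def learning_rate(step, base_lr=0.00125, steps_per_epoch=100,
                  warmup_epochs=1, start_factor=0.0, max_epochs=96):
    """Linear warmup followed by cosine decay (simplified sketch)."""
    warmup_steps = warmup_epochs * steps_per_epoch
    if step < warmup_steps:
        # Ramp linearly from start_factor * base_lr up to base_lr.
        progress = step / warmup_steps
        return base_lr * (start_factor + (1.0 - start_factor) * progress)
    # Cosine decay measured against the full max_epochs horizon.
    total_steps = max_epochs * steps_per_epoch
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))


# Learning rate at the start, mid-warmup, and at the end of epoch 80.
for epoch in (0, 0.5, 1, 40, 80):
    step = int(epoch * 100)
    print(epoch, learning_rate(step))
```

Under this reading, stopping at epoch 80 of a 96-epoch cosine curve means the learning rate is still small but nonzero when training ends.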

The visdrone_sliced_640_025_detection.yml file:

metric: COCO
num_classes: 10

TrainDataset:
  !COCODataSet
    image_dir: /home/aistudio/NV10-dataset/images
    anno_path: /home/aistudio/DeepLab_NWPU-VHR-10_Dataset_coco/NWPU VHR-10_dataset_coco/instances_train2017.json
    dataset_dir: /home/aistudio/NV10-dataset
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: /home/aistudio/NV10-dataset/images
    anno_path: /home/aistudio/DeepLab_NWPU-VHR-10_Dataset_coco/NWPU VHR-10_dataset_coco/instances_val2017.json
    dataset_dir: /home/aistudio/NV10-dataset

TestDataset:
  !ImageFolder
    anno_path: /home/aistudio/DeepLab_NWPU-VHR-10_Dataset_coco/NWPU VHR-10_dataset_coco/instances_val2017.json
    dataset_dir: /home/aistudio/NV10-dataset/images
# Overwrite the config files
!cp ../ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml
!cp ../visdrone_sliced_640_025_detection.yml configs/smalldet/_base_/visdrone_sliced_640_025_detection.yml
# Start training
!python tools/train.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --use_vdl=True --vdl_log_dir=./sliced_visdrone/ --eval

[Figures: training monitoring screenshots]

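The training monitor above shows that the model converges quickly, reaching a good mAP after roughly 20 minutes.

After training, the model is saved by default under output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/.

Model Evaluation

After training, we can run the evaluation command to measure the model's accuracy and confirm how well training worked. Use the following command.

This uses the model we have already trained. To evaluate a model you trained yourself, change the value after weights= to the path of that model's .pdparams file.

# Evaluate the model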
!python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0822 22:35:03.347236 17634 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0822 22:35:03.351995 17634 device_context.cc:465] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[08/22 22:35:08] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
[08/22 22:35:08] ppdet.engine INFO: Eval iter: 0
[08/22 22:35:13] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[08/22 22:35:13] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.54s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=2.27s).
Accumulating evaluation results...
DONE (t=0.35s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.771
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.969
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.882
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.761
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.754
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.790
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.289
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.709
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.822
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.777
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.806
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.851
[08/22 22:35:17] ppdet.engine INFO: Total sample number: 130, averge FPS: 29.180156020800634
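When comparing several runs, it can help to pull the COCO summary numbers out of the evaluation log instead of reading them by eye. The sketch below is a hypothetical helper, not part of PaddleDetection; the name `parse_coco_summary` is an assumption, and the regular expression is written against the summary lines exactly as they appear in the log above.

```python
import re

# Matches lines such as:
#  Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.771
_SUMMARY_RE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=([\d.:]+)\s*"
    r"\| area=\s*(\w+) \| maxDets=\s*(\d+) \] = (-?[\d.]+)"
)


def parse_coco_summary(log_text):
    """Map (metric, IoU range, area, maxDets) to the reported value."""
    metrics = {}
    for match in _SUMMARY_RE.finditer(log_text):
        kind, iou, area, max_dets, value = match.groups()
        metrics[(kind, iou, area, int(max_dets))] = float(value)
    return metrics


# Three lines copied from the evaluation log above.
log = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.771
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.969
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.289
"""
metrics = parse_coco_summary(log)
print(metrics[("AP", "0.50:0.95", "all", 100)])
```

In this log, the first AP line (IoU=0.50:0.95, all areas) is the headline mAP of 0.771, and the small-area AP of 0.761 is the figure most relevant to small object detection.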

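Model Prediction

Here we run batch prediction with the trained model over the entire NWPU VHR-10 dataset. The predictions are logged to VisualDL, which makes it easy to compare them with the original images and inspect the results.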

!python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model --infer_dir ../NV10-dataset/images --use_vdl=True --vdl_log_dir=./sliced_visdrone/image


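Model Export

A .pdparams file contains only the model's parameters; actual deployment also requires an export step, which can be run as follows.

Note that this uses the model we have already trained. To export a model you trained yourself, change the value after weights= to the path of that model's .pdparams file. If --output_dir is not specified, the exported model is saved under output_inference/ by default.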

!python tools/export_model.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[08/22 22:45:35] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[08/22 22:45:36] ppdet.engine INFO: Export inference config file to output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025/infer_cfg.yml
[08/22 22:45:44] ppdet.engine INFO: Export model and saved in output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025

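This completes the path from training to export for the remote sensing small object detection model. Next, let's look at how the model performs when deployed with Paddle Inference.

Model Deployment and Speed Test

Model Deployment

# Pick one validation-set image to test the deployment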
!python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025 --image_file=../NV10-dataset/images/379.jpg --run_mode=paddle --device=gpu
-----------  Running Arguments -----------
action_file: None
batch_size: 1
camera_id: -1
cpu_threads: 1
device: gpu
enable_mkldnn: False
enable_mkldnn_bfloat16: False
image_dir: None
image_file: ../NV10-dataset/images/379.jpg
model_dir: output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025
output_dir: output
random_pad: False
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: paddle
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
save_results: False
scaled: False
threshold: 0.5
tracker_config: None
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
window_size: 50
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order: 
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
class_id:9, confidence:0.8861, left_top:[279.37,190.83],right_bottom:[330.01,227.88]
class_id:9, confidence:0.8860, left_top:[431.22,252.48],right_bottom:[472.41,300.22]
class_id:9, confidence:0.7865, left_top:[367.85,140.09],right_bottom:[417.63,201.48]
save result to: output/379.jpg
Test iter 0
------------------ Inference Time Info ----------------------
total_time(ms): 1844.5, img_num: 1
average latency time(ms): 1844.50, QPS: 0.542152
preprocess_time(ms): 1823.90, inference_time(ms): 20.50, postprocess_time(ms): 0.10
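The detection lines in the log above can be parsed to see which classes were found and how large the boxes are. This is a hypothetical sketch: `parse_detections` is not a PaddleDetection function, and the class-name lookup assumes that class ids are zero-based and follow the category order listed in the dataset section, which depends on how the COCO annotation file was generated and is not confirmed by this article.

```python
import re

# ASSUMPTION: zero-based class ids in the order given in the dataset section.
CLASS_NAMES = [
    "airplane", "ship", "storage tank", "baseball diamond", "tennis court",
    "basketball court", "ground track field", "harbor", "bridge", "vehicle",
]

_DET_RE = re.compile(
    r"class_id:(\d+), confidence:([\d.]+), "
    r"left_top:\[([\d.]+),([\d.]+)\],right_bottom:\[([\d.]+),([\d.]+)\]"
)


def parse_detections(log_text):
    """Extract class id, score, box, and box size from deploy log lines."""
    detections = []
    for match in _DET_RE.finditer(log_text):
        class_id = int(match.group(1))
        score = float(match.group(2))
        x1, y1, x2, y2 = (float(v) for v in match.groups()[2:])
        name = CLASS_NAMES[class_id] if class_id < len(CLASS_NAMES) else "unknown"
        detections.append({
            "class_id": class_id,
            "name": name,
            "score": score,
            "box": (x1, y1, x2, y2),
            "size": (x2 - x1, y2 - y1),  # width and height in pixels
        })
    return detections


# First detection line copied from the log above.
log = "class_id:9, confidence:0.8861, left_top:[279.37,190.83],right_bottom:[330.01,227.88]"
detections = parse_detections(log)
print(detections[0]["name"], detections[0]["size"])
```

The first box is roughly 51 by 37 pixels, which gives a sense of how small the detected objects are relative to the input image.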


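Speed Test

For a fair comparison, the speed results in the model zoo exclude data preprocessing and post-processing of the model output (NMS), consistent with the YOLOv4 (AlexeyAB) test method. This requires specifying -o exclude_nms=True when exporting the model.

Speed measured with Paddle Inference, without TensorRT, is shown below: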

!pip install gputil
# Export the model
!python tools/export_model.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/best_model.pdparams exclude_nms=True
!python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025 --image_file=../NV10-dataset/images/379.jpg --run_mode=paddle --device=gpu --run_benchmark=True
-----------  Running Arguments -----------
action_file: None
batch_size: 1
camera_id: -1
cpu_threads: 1
device: gpu
enable_mkldnn: False
enable_mkldnn_bfloat16: False
image_dir: None
image_file: ../NV10-dataset/images/379.jpg
model_dir: output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025
output_dir: output
random_pad: False
reid_batch_size: 50
reid_model_dir: None
run_benchmark: True
run_mode: paddle
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
save_results: False
scaled: False
threshold: 0.5
tracker_config: None
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
window_size: 50
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order: 
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
Test iter 0
2022-08-23 00:25:00,281 - benchmark_utils - INFO - Paddle Inference benchmark log will be saved to /home/aistudio/PaddleDetection/deploy/python/../../output/ppyoloe_crn_l_80e_sliced_visdrone_640_025.log
2022-08-23 00:25:00,282 - benchmark_utils - INFO - 

2022-08-23 00:25:00,282 - benchmark_utils - INFO - ---------------------- Paddle info ----------------------
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] paddle_version: 2.2.2
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] paddle_commit: b031c389938bfa15e15bb20494c76f86289d77b0
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] paddle_branch: HEAD
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] log_api_version: 1.0.3
2022-08-23 00:25:00,282 - benchmark_utils - INFO - ----------------------- Conf info -----------------------
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] runtime_device: gpu
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] ir_optim: True
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] enable_memory_optim: True
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] enable_tensorrt: False
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] enable_mkldnn: False
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] cpu_math_library_num_threads: 1
2022-08-23 00:25:00,282 - benchmark_utils - INFO - ----------------------- Model info ----------------------
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] model_name: ppyoloe_crn_l_80e_sliced_visdrone_640_025
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] precision: paddle
2022-08-23 00:25:00,282 - benchmark_utils - INFO - ----------------------- Data info -----------------------
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] batch_size: 1
2022-08-23 00:25:00,282 - benchmark_utils - INFO - [DET] input_shape: dynamic_shape
2022-08-23 00:25:00,283 - benchmark_utils - INFO - [DET] data_num: 1
2022-08-23 00:25:00,283 - benchmark_utils - INFO - ----------------------- Perf info -----------------------
2022-08-23 00:25:00,283 - benchmark_utils - INFO - [DET] cpu_rss(MB): 2765, cpu_vms: 0, cpu_shared_mb: 0, cpu_dirty_mb: 0, cpu_util: 0%
2022-08-23 00:25:00,283 - benchmark_utils - INFO - [DET] gpu_rss(MB): 1354, gpu_util: 0.0%, gpu_mem_util: 0%
2022-08-23 00:25:00,283 - benchmark_utils - INFO - [DET] total time spent(s): 0.0381
2022-08-23 00:25:00,283 - benchmark_utils - INFO - [DET] preprocess_time(ms): 25.0, inference_time(ms): 13.1, postprocess_time(ms): 0.0
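The two timing reports in this section break latency into preprocessing, inference, and post-processing, and the breakdown matters when reading the totals. The sketch below recomputes the derived figures from the logged values; the helper name `summarize` is hypothetical, and the only inputs are numbers copied from the two logs.

```python
def summarize(preprocess_ms, inference_ms, postprocess_ms):
    """Derive end-to-end and model-only throughput from a latency breakdown."""
    total_ms = preprocess_ms + inference_ms + postprocess_ms
    return {
        "total_ms": total_ms,
        "end_to_end_qps": 1000.0 / total_ms,
        "model_only_fps": 1000.0 / inference_ms,
        "preprocess_share": preprocess_ms / total_ms,
    }


# Single-image deployment run (values from the first timing report).
single_run = summarize(1823.90, 20.50, 0.10)
# Benchmark run with exclude_nms=True (values from the benchmark log).
benchmark_run = summarize(25.0, 13.1, 0.0)

for name, stats in (("single run", single_run), ("benchmark", benchmark_run)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

In the single-image run almost all of the 1844.5 ms is preprocessing, so the reported QPS of about 0.54 says little about the model itself; the inference step alone took 20.5 ms. In the benchmark run the three stages add up to 38.1 ms, matching the logged total of 0.0381 s, with 13.1 ms spent in inference.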

This article is a repost.
Link to the original project
