PP-Tracking: A Hands-On Guide to Single-Camera Pedestrian Tracking
PP-Tracking is the industry's first open-source real-time tracking system built on the PaddlePaddle deep learning framework. Targeting the pain points of real-world applications, **PP-Tracking ships with pedestrian and vehicle tracking, cross-camera tracking, multi-class tracking, small-object tracking, and flow counting, together with industrial applications, and provides a visual development interface.** The model integrates lightweight multi-object tracking, object detection, and ReID algorithms, further improving PP-Tracking's server-side deployment performance. It also supports Python and C++ deployment and runs on Linux and NVIDIA Jetson platforms.
The following walkthrough shows how to use the sample code to train, evaluate, and run inference for a single-camera tracking model on a dataset you have already created in BML, and also touches on multi-camera deployment.
Without further ado, let's look at the result first.
Step 1: Environment Setup
- Download the open-source code
- Install the required dependencies
!cd work/ && git clone https://gitee.com/paddlepaddle/PaddleDetection.git -b develop
Cloning into 'PaddleDetection'...
remote: Enumerating objects: 760, done.
remote: Counting objects: 100% (760/760), done.
remote: Compressing objects: 100% (418/418), done.
remote: Total 20290 (delta 493), reused 508 (delta 342), pack-reused 19530
Receiving objects: 100% (20290/20290), 201.01 MiB | 31.58 MiB/s, done.
Resolving deltas: 100% (15042/15042), done.
Checking connectivity... done.
!cd work/PaddleDetection/ && pip install -r requirements.txt && python setup.py install
Successfully installed terminaltables-3.1.0 typeguard-2.13.2 xmltodict-0.12.0
.../ppdet/modeling/backbones/lite_hrnet.py:702: SyntaxWarning: assertion is always true, perhaps remove parentheses?
Finished processing dependencies for paddledet==2.3.0
Step 2: Dataset Preparation
The MOT dataset directory needs to be reorganized as follows.
Before:
MOT16
└——————train
└——————test
After:
MOT16
|——————images
| └——————train
| └——————test
└——————labels_with_ids
└——————train
For details, see the MOT data preparation documentation.
In this tutorial we do not need to download the dataset ourselves: the project already references the MOT-16 dataset uploaded by a platform user. If you need it, you can also download it locally for training.
!mv /home/aistudio/data/data118993/MOT16.zip work/PaddleDetection/dataset/mot/
!cd work/PaddleDetection/dataset/mot && unzip MOT16.zip -d MOT16
inflating: MOT16/train/MOT16-13/img1/000750.jpg
1. Generate labels_with_ids
Create an images directory, move the train and test folders into it, and run the label generation script:
!cd work/PaddleDetection/dataset/mot/MOT16/ && mkdir -p images
!cd work/PaddleDetection/dataset/mot/MOT16 && mv ./train ./images && mv ./test ./images
!cd work/PaddleDetection/dataset/mot && python gen_labels_MOT.py
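After this step every frame has a companion txt file under labels_with_ids, with one line per box in the form class id x_center y_center width height, where id is a globally renumbered track id and the coordinates are normalized by the image size. Below is a minimal sketch of the conversion, simplified from gen_labels_MOT.py and assuming the standard MOT16 gt.txt column order (frame, id, bb_left, bb_top, w, h, mark, class, visibility):

import configparser
import os
import os.path as osp

import numpy as np

seq_root = 'work/PaddleDetection/dataset/mot/MOT16/images/train'
label_root = 'work/PaddleDetection/dataset/mot/MOT16/labels_with_ids/train'

tid_curr, tid_last = 0, -1
for seq in sorted(os.listdir(seq_root)):
    # Image size comes from each sequence's seqinfo.ini
    ini = configparser.ConfigParser()
    ini.read(osp.join(seq_root, seq, 'seqinfo.ini'))
    width = int(ini['Sequence']['imWidth'])
    height = int(ini['Sequence']['imHeight'])

    gt = np.loadtxt(osp.join(seq_root, seq, 'gt', 'gt.txt'),
                    dtype=np.float64, delimiter=',')
    os.makedirs(osp.join(label_root, seq, 'img1'), exist_ok=True)

    for fid, tid, x, y, w, h, mark, label, _ in gt:
        if mark == 0 or int(label) != 1:  # keep only valid pedestrian boxes
            continue
        if int(tid) != tid_last:          # renumber identities globally
            tid_curr += 1
            tid_last = int(tid)
        cx, cy = x + w / 2, y + h / 2     # top-left corner -> box center
        line = '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
            tid_curr, cx / width, cy / height, w / width, h / height)
        with open(osp.join(label_root, seq, 'img1',
                           '{:06d}.txt'.format(int(fid))), 'a') as f:
            f.write(line)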
2. Generate the mot16.train file and copy it into image_lists
import glob
import os.path as osp

# Collect every training frame; paths are written relative to dataset/mot
image_list = []
for seq in sorted(glob.glob('work/PaddleDetection/dataset/mot/MOT16/images/train/*')):
    for image in glob.glob(osp.join(seq, 'img1') + '/*.jpg'):
        image = image.replace('work/PaddleDetection/dataset/mot/', '')
        image_list.append(image)
with open('mot16.train', 'w') as image_list_file:
    image_list_file.write('\n'.join(image_list))
!mkdir -p work/PaddleDetection/dataset/mot/image_lists && cp -r mot16.train work/PaddleDetection/dataset/mot/image_lists
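As an optional sanity check (not part of the original tutorial), confirm the list landed in image_lists and that the paths are relative to dataset/mot; the dataset summary in the training log below reports the same image count:

# Optional check; the example path in the comment is illustrative
with open('work/PaddleDetection/dataset/mot/image_lists/mot16.train') as f:
    lines = f.read().splitlines()
print(len(lines), 'images')  # the full MOT16 train split has 5316 frames
print(lines[0])              # e.g. MOT16/images/train/MOT16-02/img1/000001.jpg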
3. Configure the dataset yml file
Point the configuration at our dataset by appending the following at the end of PaddleDetection/configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml (the config used for training below):
# for MOT training
TrainDataset:
  !MOTDataSet
    dataset_dir: dataset/mot  # directory the training set lives in
    image_lists: ['mot16.train']
    data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']

# for MOT evaluation
# If you want to change the MOT evaluation dataset, please modify 'data_root'
EvalMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    data_root: MOT16/images/train
    keep_ori_im: False  # set True if save visualization images or video, or used in DeepSORT

# for MOT video inference
TestMOTDataset:
  !MOTImageFolder
    dataset_dir: dataset/mot
    keep_ori_im: True  # set True if save visualization images or video
Step 3: Model Training
We train for 30 epochs on MOT16. Because we use the full MOT16 data, training takes much longer than it would on a subset of MOT16; when reproducing this demo you may want to pick a MOT16 subset instead.
!cd work/PaddleDetection/ && python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608/ --gpus 0 tools/train.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml
----------- Configuration Arguments -----------
backend: auto
elastic_server: None
force: False
gpus: 0
heter_worker_num: None
heter_workers:
host: None
http_port: None
ips: 127.0.0.1
job_id: None
log_dir: ./fairmot_dla34_30e_1088x608/
np: None
nproc_per_node: None
run_mode: None
scale: 0
server_num: None
servers:
training_script: tools/train.py
training_script_args: ['-c', 'configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml']
worker_num: None
workers:
------------------------------------------------
WARNING 2021-12-06 17:31:34,347 launch.py:416] Not found distinct arguments and compiled with cuda or xpu. Default use collective mode
launch train in GPU mode!
INFO 2021-12-06 17:31:34,348 launch_utils.py:527] Local start 1 processes. First process distributed environment info (Only For Debug):
+=======================================================================================+
| Distributed Envs Value |
+---------------------------------------------------------------------------------------+
| PADDLE_TRAINER_ID 0 |
| PADDLE_CURRENT_ENDPOINT 127.0.0.1:47105 |
| PADDLE_TRAINERS_NUM 1 |
| PADDLE_TRAINER_ENDPOINTS 127.0.0.1:47105 |
| PADDLE_RANK_IN_NODE 0 |
| PADDLE_LOCAL_DEVICE_IDS 0 |
| PADDLE_WORLD_DEVICE_IDS 0 |
| FLAGS_selected_gpus 0 |
| FLAGS_selected_accelerators 0 |
+=======================================================================================+
INFO 2021-12-06 17:31:34,348 launch_utils.py:531] details abouts PADDLE_TRAINER_ENDPOINTS can be found in ./fairmot_dla34_30e_1088x608//endpoints.log, and detail running logs maybe found in ./fairmot_dla34_30e_1088x608//workerlog.0
launch proc_id:1911 idx:0
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/tensor/creation.py:130: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
if data.dtype == np.object:
[12/06 17:31:38] ppdet.data.source.mot INFO: MOT dataset summary:
[12/06 17:31:38] ppdet.data.source.mot INFO: OrderedDict([('mot16.train', 518)])
[12/06 17:31:38] ppdet.data.source.mot INFO: Total images: 5316
[12/06 17:31:38] ppdet.data.source.mot INFO: Image start index: OrderedDict([('mot16.train', 0)])
[12/06 17:31:38] ppdet.data.source.mot INFO: Total identities: 519
[12/06 17:31:38] ppdet.data.source.mot INFO: Identity start index: OrderedDict([('mot16.train', 0)])
W1206 17:31:41.335633 1911 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W1206 17:31:41.340693 1911 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[12/06 17:31:47] ppdet.utils.download INFO: Downloading fairmot_dla34_crowdhuman_pretrained.pdparams from https://paddledet.bj.bcebos.com/models/pretrained/fairmot_dla34_crowdhuman_pretrained.pdparams
100%|██████████| 127930/127930 [00:02<00:00, 49433.09KB/s]
[12/06 17:31:50] ppdet.utils.checkpoint INFO: The shape [14455] in pretrained weight reid.classifier.bias is unmatched with the shape [519] in model reid.classifier.bias. And the weight reid.classifier.bias will not be loaded
[12/06 17:31:50] ppdet.utils.checkpoint INFO: The shape [128, 14455] in pretrained weight reid.classifier.weight is unmatched with the shape [128, 519] in model reid.classifier.weight. And the weight reid.classifier.weight will not be loaded
[12/06 17:31:50] ppdet.utils.checkpoint INFO: Finish loading model weights: /home/aistudio/.cache/paddle/weights/fairmot_dla34_crowdhuman_pretrained.pdparams
[12/06 17:31:51] ppdet.engine INFO: Epoch: [0] [ 0/886] learning_rate: 0.000100 loss: 11.543140 heatmap_loss: 0.707743 size_loss: 0.841231 offset_loss: 0.199213 det_loss: 0.991079 reid_loss: 6.887893 eta: 4:44:04 batch_cost: 0.6412 data_cost: 0.0004 ips: 9.3568 images/s
[12/06 17:32:03] ppdet.engine INFO: Epoch: [0] [ 20/886] learning_rate: 0.000100 loss: 11.371736 heatmap_loss: 0.790844 size_loss: 1.153176 offset_loss: 0.207417 det_loss: 1.129737 reid_loss: 6.450063 eta: 4:35:46 batch_cost: 0.6221 data_cost: 0.0003 ips: 9.6453 images/s
[12/06 17:32:16] ppdet.engine INFO: Epoch: [0] [ 40/886] learning_rate: 0.000100 loss: 10.481521 heatmap_loss: 0.754798 size_loss: 0.956801 offset_loss: 0.211136 det_loss: 1.079346 reid_loss: 6.054442 eta: 4:36:08 batch_cost: 0.6257 data_cost: 0.0003 ips: 9.5895 images/s
[12/06 17:32:28] ppdet.engine INFO: Epoch: [0] [ 60/886] learning_rate: 0.000100 loss: 7.048557 heatmap_loss: 0.578546 size_loss: 0.739798 offset_loss: 0.202264 det_loss: 0.905977 reid_loss: 4.017921 eta: 4:44:41 batch_cost: 0.6728 data_cost: 0.0003 ips: 8.9184 images/s
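A full run at this pace takes over four and a half hours on a single GPU (see the eta field above). If your session is interrupted, PaddleDetection's tools/train.py accepts a -r/--resume argument pointing at a saved checkpoint prefix, so (assuming checkpoints named by epoch under the output directory, which is PaddleDetection's default; the epoch number 9 here is only an example) the run can be continued with something like:

!cd work/PaddleDetection/ && python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608/ --gpus 0 tools/train.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -r output/fairmot_dla34_30e_1088x608/9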
Step 4: Model Evaluation
Run the evaluation below; the numbers printed at the end, as shown in the figure below, are our model's final evaluation result.
!cd work/PaddleDetection && CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams
The output is long, so only the final part is shown here:
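For reading that table: the headline metric is MOTA, which folds missed detections (FN), false positives (FP), and identity switches (IDSW) into a single score over the total number of ground-truth boxes (GT), while IDF1 complements it by measuring how consistently identities are preserved (higher is better for both):

$$\mathrm{MOTA} = 1 - \frac{\mathrm{FN} + \mathrm{FP} + \mathrm{IDSW}}{\mathrm{GT}}$$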
Step 5: Model Inference
Run inference with the trained weights. To keep things quick, we only run it on the frames under dataset/mot/MOT16/images/test/MOT16-01/img1.
The tracking video is saved to output/mot_outputs/img1_vis.mp4 and the txt results to output/mot_results/img1.txt; each output line has the format frame_id, id, bbox_left, bbox_top, bbox_width, bbox_height, score, x, y, z.
!cd work/PaddleDetection/ && CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams --image_dir=dataset/mot/MOT16/images/test/MOT16-01/img1 --save_videos
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/tensor/creation.py:130: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
if data.dtype == np.object:
W1206 22:39:30.301265 15564 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W1206 22:39:30.306080 15564 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[12/06 22:39:32] ppdet.utils.checkpoint INFO: Finish resuming model weights: output/fairmot_dla34_30e_1088x608/model_final.pdparams
[12/06 22:39:32] ppdet.engine.tracker INFO: Starting tracking folder dataset/mot/MOT16/images/test/MOT16-01/img1, found 450 images
[12/06 22:39:32] ppdet.engine.tracker INFO: Processing frame 0 (100000.00 fps)
[12/06 22:39:36] ppdet.engine.tracker INFO: Processing frame 40 (19.35 fps)
[12/06 22:39:39] ppdet.engine.tracker INFO: Processing frame 80 (19.52 fps)
[12/06 22:39:43] ppdet.engine.tracker INFO: Processing frame 120 (19.66 fps)
[12/06 22:39:46] ppdet.engine.tracker INFO: Processing frame 160 (19.72 fps)
[12/06 22:39:50] ppdet.engine.tracker INFO: Processing frame 200 (19.77 fps)
[12/06 22:39:54] ppdet.engine.tracker INFO: Processing frame 240 (18.92 fps)
[12/06 22:39:58] ppdet.engine.tracker INFO: Processing frame 280 (18.88 fps)
[12/06 22:40:01] ppdet.engine.tracker INFO: Processing frame 320 (18.99 fps)
[12/06 22:40:05] ppdet.engine.tracker INFO: Processing frame 360 (19.10 fps)
[12/06 22:40:08] ppdet.engine.tracker INFO: Processing frame 400 (19.19 fps)
[12/06 22:40:12] ppdet.engine.tracker INFO: Processing frame 440 (19.29 fps)
ffmpeg version 2.8.15-0ubuntu0.16.04.1 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.10) 20160609
configuration: --prefix=/usr --extra-version=0ubuntu0.16.04.1 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-opengl --enable-x11grab --enable-libdc1394 --enable-libiec61883 --enable-libzmq --enable-frei0r --enable-libx264 --enable-libopencv
libavutil 54. 31.100 / 54. 31.100
libavcodec 56. 60.100 / 56. 60.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 40.101 / 5. 40.101
libavresample 2. 1. 0 / 2. 1. 0
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
[mjpeg @ 0x1485720] Changeing bps to 8
Input #0, image2, from 'output/mot_outputs/img1/%05d.jpg':
Duration: 00:00:18.00, start: 0.000000, bitrate: N/A
Stream #0:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 25 tbn, 25 tbc
No pixel format specified, yuvj420p for H.264 encoding chosen.
Use -pix_fmt yuv420p for compatibility with outdated media players.
[libx264 @ 0x14883e0] using SAR=1/1
[libx264 @ 0x14883e0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
[libx264 @ 0x14883e0] profile High, level 4.0
[libx264 @ 0x14883e0] 264 - core 148 r2643 5c65704 - H.264/MPEG-4 AVC codec - Copyleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=34 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output/mot_outputs/img1/../img1_vis.mp4':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuvj420p(pc), 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc
Metadata:
encoder : Lavc56.60.100 libx264
Stream mapping:
Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
Press [q] to stop, [?] for help
frame= 450 fps=8.1 q=-1.0 Lsize= 18529kB time=00:00:17.92 bitrate=8470.4kbits/s
video:18523kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.034153%
[libx264 @ 0x14883e0] frame I:2 Avg QP:20.49 size:150857
[libx264 @ 0x14883e0] frame P:177 Avg QP:22.36 size: 78490
[libx264 @ 0x14883e0] frame B:271 Avg QP:26.49 size: 17610
[libx264 @ 0x14883e0] consecutive B-frames: 3.3% 31.6% 52.7% 12.4%
[libx264 @ 0x14883e0] mb I I16..4: 14.4% 80.9% 4.7%
[libx264 @ 0x14883e0] mb P I16..4: 3.9% 27.3% 1.6% P16..4: 20.9% 15.6% 14.6% 0.0% 0.0% skip:16.0%
[libx264 @ 0x14883e0] mb B I16..4: 1.3% 6.5% 0.2% B16..8: 32.7% 6.9% 1.8% direct: 2.1% skip:48.5% L0:50.4% L1:40.5% BI: 9.1%
[libx264 @ 0x14883e0] 8x8 transform intra:82.5% inter:87.0%
[libx264 @ 0x14883e0] coded y,uvDC,uvAC intra: 62.4% 60.5% 3.6% inter: 20.4% 13.9% 2.5%
[libx264 @ 0x14883e0] i16 v,h,dc,p: 35% 29% 35% 1%
[libx264 @ 0x14883e0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 18% 24% 49% 2% 1% 1% 1% 1% 3%
[libx264 @ 0x14883e0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 44% 24% 12% 3% 3% 3% 4% 3% 3%
[libx264 @ 0x14883e0] i8c dc,h,v,p: 37% 29% 32% 2%
[libx264 @ 0x14883e0] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x14883e0] ref P L0: 53.1% 13.1% 20.0% 13.8%
[libx264 @ 0x14883e0] ref B L0: 71.1% 22.9% 6.0%
[libx264 @ 0x14883e0] ref B L1: 84.8% 15.2%
[libx264 @ 0x14883e0] kb/s:8429.62
[12/06 22:41:08] ppdet.engine.tracker INFO: Save video in output/mot_outputs/img1/../img1_vis.mp4
MOT results save in output/mot_results/img1.txt
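The txt results are easy to post-process. Here is a small sketch of reading them back, run from work/PaddleDetection; per the MOT Challenge convention the trailing x, y, z fields are -1 for 2D tracking:

import csv
from collections import defaultdict

results_path = 'output/mot_results/img1.txt'  # as reported in the log above

tracks = defaultdict(list)  # track id -> list of (frame_id, (x, y, w, h), score)
with open(results_path) as f:
    for row in csv.reader(f):
        frame_id, track_id = int(row[0]), int(row[1])
        x, y, w, h = map(float, row[2:6])
        score = float(row[6])  # the remaining fields are x, y, z: -1 here
        tracks[track_id].append((frame_id, (x, y, w, h), score))

print(len(tracks), 'identities found')
for tid in sorted(tracks)[:5]:
    dets = tracks[tid]
    print('id {}: {} frames, first seen at frame {}'.format(tid, len(dets), dets[0][0]))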
Once inference finishes, you can open the generated mp4 video locally and watch the pedestrian tracking; the result should look similar to the screenshot below.