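Windows: C++ Inference Deployment for PaddleDetection
This article uses C++ as the deployment language and Windows as the deployment platform, with the CPU version of the Paddle Inference library, and builds a .exe executable for inference deployment.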
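0. Preface
Once an object detection model has been trained, the next step is deployment. Common deployment options for AI applications include local deployment, cloud deployment, and edge deployment. This section covers local deployment: on Windows, using C++ and the CPU version of the Paddle Inference library, with a .exe executable as the final deliverable.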
1. Prerequisites
- Windows 10
- Visual Studio 2019 Community
- CMake 3.0+ (download link)
- Python 3.7
- OpenCV 3.4.6
- Paddle Inference 2.3.0 (CPU version)
Note: because inference runs on the CPU here, CUDA, cuDNN, and TensorRT do not need to be configured.
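2. Download the inference code
To make model development and deployment more efficient, the PaddleDetection object detection toolkit is recommended here. It has built-in training and deployment features that cover a range of scenarios.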
git clone https://github.com/PaddlePaddle/PaddleDetection
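After cloning, switch to the deploy/cpp directory, which contains the C++ deployment code needed for this task. Because this directory does not depend on any other directory in PaddleDetection, deploy/cpp is copied out on its own here, renamed (to swineCountingCPU in this article), and placed in the project directory (here E:\Program).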
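3. Download the PaddlePaddle C++ inference library (paddle_inference)
Prebuilt versions of the PaddlePaddle C++ inference library are provided for different CPU and CUDA configurations. Download the CPU version here from the C++ inference library download list.
For easier management, the extracted library is renamed to paddle_inference2.3_cpu and placed in the E:\Program\tools folder.
paddle_inference contains the following entries: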
paddle_inference2.3_cpu
|--paddle # Paddle core libraries and headers
|--third_party # third-party dependency libraries and headers
|--version.txt # version and build information
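Before moving on, it can help to confirm that the extracted library has the layout listed above. The following is a minimal sketch, not part of the original workflow; the helper name `missing_entries` is hypothetical, and the path in the demo is the one used in this article.

```python
from pathlib import Path
from typing import List

# Top-level entries the article lists for the extracted inference library.
EXPECTED_ENTRIES = ("paddle", "third_party", "version.txt")


def missing_entries(root: str) -> List[str]:
    """Return the expected top-level entries that are absent under root."""
    base = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (base / name).exists()]


if __name__ == "__main__":
    # Path used in this article; adjust it to your own layout.
    missing = missing_entries(r"E:\Program\tools\paddle_inference2.3_cpu")
    print("missing:", missing if missing else "none")
```

An empty result only means the three entries exist; it does not verify that the library matches your compiler or CPU.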
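4. Install and configure OpenCV
The main steps are:
- Download version 3.4.6 for Windows from the official OpenCV website (download link).
- Run the downloaded executable (.exe) and extract OpenCV to a directory of your choice; here it is extracted to E:\Program\tools.
- Configure the environment variable as follows (this can be skipped if the OpenCV path is already available through a global environment variable):
  - My Computer -> Properties -> Advanced system settings -> Environment Variables
  - Find Path under System variables (create it if it does not exist) and double-click to edit it
  - Click New, enter the OpenCV path, and save, e.g. E:\Program\tools\opencv\build\x64\vc15\bin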
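To check whether the Path change has taken effect in a newly opened terminal, something like the sketch below can be used. The function name `dir_on_path` is hypothetical; the comparison uses `ntpath` so that it follows Windows rules (case-insensitive, `/` and `\` treated alike) regardless of where the script runs.

```python
import ntpath
import os


def dir_on_path(directory: str, path_value: str) -> bool:
    """Return True if directory is one of the ';'-separated entries in path_value."""

    def norm(p: str) -> str:
        # Drop surrounding quotes/whitespace, trailing separators and case differences.
        return ntpath.normcase(ntpath.normpath(p.strip().strip('"')))

    wanted = norm(directory)
    entries = [e for e in path_value.split(";") if e.strip()]
    return any(norm(entry) == wanted for entry in entries)


if __name__ == "__main__":
    # OpenCV bin directory used in this article.
    opencv_bin = r"E:\Program\tools\opencv\build\x64\vc15\bin"
    print(dir_on_path(opencv_bin, os.environ.get("PATH", "")))
```

Environment variable changes are only picked up by processes started afterwards, so run the check from a fresh terminal.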
The final deployment directory is organized as follows:
E:\Program
|--swineCountingCPU # copy of the PaddleDet2.4-swineCounting/deploy/cpp directory
|  |--cmake
|  |--docs
|  |--include
|  |--scripts
|  |--src
|  |--CMakeLists.txt
|--tools
|  |--opencv # version 3.4.6
|  |--paddle_inference2.3_cpu
5. Compilation
5.1 Change into the E:\Program\swineCountingCPU folder
cd E:\Program\swineCountingCPU
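5.2 Generate the project files with CMake
CMake can generate the project files in two ways: with cmake-gui or with the cmake command. cmake-gui is used here, and the project files are saved in the out directory.
Steps:
1) Open cmake-gui on Windows and enter the path of the inference folder in the "Where is the source code" field, here E:\Program\swineCountingCPU;
2) Enter the path for the generated project in the "Where to build the binaries" field, here E:\Program\swineCountingCPU\out;
3) Click Configure, select the Visual Studio 2019 x64 option, then click Generate;
4) An error is reported at this point, which is expected. The OPENCV_DIR, PADDLE_DIR, and PADDLE_LIB_NAME entries need to be filled in;
5) Once they are filled in, click Generate again, and the .sln solution is generated.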
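The build parameters are described below (those marked with * only need to be set when using the GPU version of the inference library; keep the CUDA library versions aligned where possible, using 9.0 or 10.0 rather than 9.2, 10.1, and so on):
Parameter | Meaning |
---|---|
*CUDA_LIB | Path to the CUDA libraries |
*CUDNN_LIB | Path to the cuDNN libraries |
OPENCV_DIR | OpenCV installation path |
PADDLE_DIR | Path to the Paddle inference library |
PADDLE_LIB_NAME | Name of the Paddle inference library |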
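Notes:
- When using the CPU version of the inference library, uncheck WITH_GPU.
- When using the OpenBLAS build, uncheck WITH_MKL.
- If keypoint models are not needed, WITH_KEYPOINT can be unchecked.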
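For readers who prefer the cmake command over cmake-gui, the same configuration can be sketched as a command line. The helper below only assembles the arguments; the option names come from the table and notes above, while the `PADDLE_LIB_NAME` value `paddle_inference` and the exact directory values are assumptions to adapt. `-G "Visual Studio 16 2019"` and `-A x64` are standard CMake generator options (this generator needs CMake 3.14 or newer, and `-S`/`-B` need 3.13 or newer).

```python
import subprocess
from typing import List


def on_off(flag: bool) -> str:
    """Render a boolean as the ON/OFF value CMake expects."""
    return "ON" if flag else "OFF"


def cmake_configure_args(source_dir: str, build_dir: str, opencv_dir: str,
                         paddle_dir: str, paddle_lib_name: str,
                         with_gpu: bool = False, with_mkl: bool = True,
                         with_keypoint: bool = True) -> List[str]:
    """Build the argument list for a CPU-only configure step."""
    return [
        "cmake",
        "-S", source_dir,
        "-B", build_dir,
        "-G", "Visual Studio 16 2019",
        "-A", "x64",
        "-DWITH_GPU=" + on_off(with_gpu),
        "-DWITH_MKL=" + on_off(with_mkl),
        "-DWITH_KEYPOINT=" + on_off(with_keypoint),
        "-DOPENCV_DIR=" + opencv_dir,
        "-DPADDLE_DIR=" + paddle_dir,
        "-DPADDLE_LIB_NAME=" + paddle_lib_name,
    ]


if __name__ == "__main__":
    args = cmake_configure_args(
        r"E:\Program\swineCountingCPU",
        r"E:\Program\swineCountingCPU\out",
        r"E:\Program\tools\opencv",
        r"E:\Program\tools\paddle_inference2.3_cpu",
        "paddle_inference",  # assumed library name; check your download
    )
    # Print the command for review; pass args to subprocess.run to execute it.
    print(subprocess.list2cmdline(args))
```

Keeping the arguments in a list avoids quoting problems with values that contain spaces, such as the generator name.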
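5.3 Compile
Once the .sln solution has been generated, open PaddleObjectDetector.sln in the out folder with Visual Studio 2019, set the build configuration to Release, and click Build -> Rebuild Solution.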
The console output is as follows:
3>Generating code...
3>main.vcxproj -> E:\Program\swineCounting_inference\PaddleDet2.4-swineCounting\deploy\cpp\Release\main.exe
3>Done building project "main.vcxproj".
4>------ Build started: Project: ALL_BUILD, Configuration: Release x64 ------
4>Building Custom Rule E:/Program/swineCounting_inference/PaddleDet2.4-swineCounting/deploy/cpp/CMakeLists.txt
========== Build: 4 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========
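As an alternative to building from the Visual Studio GUI, the generated solution can also be built from the command line with `cmake --build`. The sketch below wraps that call; the function names are hypothetical, and the `runner` parameter exists only so the call can be exercised without actually invoking CMake.

```python
import subprocess
from typing import List


def cmake_build_args(build_dir: str, config: str = "Release") -> List[str]:
    """Arguments for building an already generated project in build_dir."""
    return ["cmake", "--build", build_dir, "--config", config]


def build(build_dir: str, config: str = "Release", runner=subprocess.run):
    """Run the build and raise if CMake reports a failure (check=True)."""
    return runner(cmake_build_args(build_dir, config), check=True)


if __name__ == "__main__":
    # Only print the command here; call build(...) to actually run it.
    print(subprocess.list2cmdline(cmake_build_args(r"E:\Program\swineCountingCPU\out")))
```

With a Visual Studio generator the configuration is chosen at build time, which is why `--config Release` is passed here rather than during project generation.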
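5.4 Prediction and visualization
Once main.exe has been built, prepare the Release directory:
- Copy paddle2onnx.dll from E:\Program\tools\paddle_inference2.3_cpu\third_party\install\paddle2onnx\lib to out\Release.
- Copy onnxruntime.dll from E:\Program\tools\paddle_inference2.3_cpu\third_party\install\onnxruntime\lib to out\Release.
- (Optional) Put the exported model folder model and the test image 1.jpg in the Release directory.
Then change into the Release folder.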
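Notes:
a. The model folder holds the trained and exported model, with the following files:
model.pdiparams
model.pdiparams.info
model.pdmodel
infer_cfg.yml
b. output is the output folder
c. 1.jpg is the image to run inference on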
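The manual copy step and the model folder check can be scripted. The following is a minimal sketch under the directory layout described in this article; the function names are hypothetical, and the relative DLL paths are the ones named in the text above.

```python
import shutil
from pathlib import Path
from typing import List

# DLL locations relative to the inference library root, as given in the article.
RUNTIME_DLLS = (
    Path("third_party/install/paddle2onnx/lib/paddle2onnx.dll"),
    Path("third_party/install/onnxruntime/lib/onnxruntime.dll"),
)
# Files the article lists for an exported model folder.
MODEL_FILES = ("model.pdiparams", "model.pdiparams.info", "model.pdmodel", "infer_cfg.yml")


def copy_runtime_dlls(paddle_dir, release_dir) -> List[Path]:
    """Copy the two runtime DLLs next to main.exe and return the new paths."""
    dest = Path(release_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for rel in RUNTIME_DLLS:
        src = Path(paddle_dir) / rel
        if not src.is_file():
            raise FileNotFoundError("runtime DLL not found: {}".format(src))
        copied.append(Path(shutil.copy2(str(src), str(dest / rel.name))))
    return copied


def missing_model_files(model_dir) -> List[str]:
    """Return the expected model files that are absent from model_dir."""
    return [name for name in MODEL_FILES if not (Path(model_dir) / name).is_file()]


# Example call with the paths from this article (not executed here):
# copy_runtime_dlls(r"E:\Program\tools\paddle_inference2.3_cpu",
#                   r"E:\Program\swineCountingCPU\out\Release")
```

Raising on a missing DLL makes a wrong library path visible at this point rather than when main.exe fails to start.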
Run the prediction:
1) Single-image prediction, MKL-DNN disabled
.\main --model_dir=model --image_file=1.jpg
Output:
E:\Program\swineCountingCPU\out\Release>.\main --model_dir=model \ --image_file=1.jpg
total images = 1, batch_size = 1, total steps = 1
class=0 confidence=0.8868 rect=[608 842 1034 1011]
class=0 confidence=0.8771 rect=[1499 668 1616 883]
class=0 confidence=0.8517 rect=[363 464 489 671]
class=0 confidence=0.7817 rect=[349 702 605 792]
class=0 confidence=0.7756 rect=[215 405 355 641]
class=0 confidence=0.7470 rect=[175 632 242 768]
class=0 confidence=0.7394 rect=[839 714 1279 812]
class=0 confidence=0.6689 rect=[1266 148 1438 242]
class=0 confidence=0.6608 rect=[1400 449 1498 615]
class=0 confidence=0.6606 rect=[521 190 634 295]
class=0 confidence=0.6350 rect=[1406 282 1521 410]
class=0 confidence=0.5884 rect=[425 795 640 964]
class=0 confidence=0.5517 rect=[452 196 524 295]
class=0 confidence=0.5035 rect=[1607 741 1674 912]
1.jpg The number of detected box: 14
Visualized output saved as output\1.jpg
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0623 15:38:42.337302 17188 main.cc:76] ----------------------- Config info -----------------------
I0623 15:38:42.337302 17188 main.cc:77] runtime_device: CPU
I0623 15:38:42.338304 17188 main.cc:78] ir_optim: True
I0623 15:38:42.339303 17188 main.cc:80] enable_memory_optim: True
I0623 15:38:42.339303 17188 main.cc:89] enable_tensorrt: False
I0623 15:38:42.340296 17188 main.cc:91] precision: fp32
I0623 15:38:42.341290 17188 main.cc:94] enable_mkldnn: False
I0623 15:38:42.342288 17188 main.cc:95] cpu_math_library_num_threads: 1
I0623 15:38:42.343286 17188 main.cc:96] ----------------------- Data info -----------------------
I0623 15:38:42.344286 17188 main.cc:97] batch_size: 1
I0623 15:38:42.346278 17188 main.cc:98] input_shape: dynamic shape
I0623 15:38:42.350271 17188 main.cc:100] ----------------------- Model info -----------------------
I0623 15:38:42.350271 17188 main.cc:102] model_name: model
I0623 15:38:42.355252 17188 main.cc:104] ----------------------- Perf info ------------------------
I0623 15:38:42.356251 17188 main.cc:105] Total number of predicted data: 1 and total time spent(ms): 839
I0623 15:38:42.356251 17188 main.cc:108] preproce_time(ms): 25.7308, inference_time(ms): 814.418, postprocess_time(ms): 0.0381
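Console output in this format is easy to post-process, for example to count detections or collect timings across runs. The sketch below parses the two line types seen in the log; the regular expressions are written against the output shown in this article and may need adjusting for other versions of main.exe. The four `rect` numbers are kept exactly as printed, without interpreting them.

```python
import re
from typing import Dict, List, Tuple

DET_RE = re.compile(r"class=(\d+) confidence=([\d.]+) rect=\[(\d+) (\d+) (\d+) (\d+)\]")
PERF_RE = re.compile(r"(preproce_time|inference_time|postprocess_time)\(ms\): ([\d.]+)")

Detection = Tuple[int, float, Tuple[int, ...]]


def parse_detections(text: str) -> List[Detection]:
    """Extract (class_id, confidence, rect) for every detection line in text."""
    return [
        (int(m.group(1)), float(m.group(2)), tuple(int(v) for v in m.group(3, 4, 5, 6)))
        for m in DET_RE.finditer(text)
    ]


def parse_perf(text: str) -> Dict[str, float]:
    """Extract the per-stage timings in milliseconds, keyed by the log's own names."""
    return {name: float(value) for name, value in PERF_RE.findall(text)}


if __name__ == "__main__":
    sample = (
        "class=0 confidence=0.8868 rect=[608 842 1034 1011]\n"
        "class=0 confidence=0.8771 rect=[1499 668 1616 883]\n"
        "preproce_time(ms): 25.7308, inference_time(ms): 814.418, postprocess_time(ms): 0.0381\n"
    )
    print(len(parse_detections(sample)), "detections")
    print(parse_perf(sample))
```

The key `preproce_time` is spelled as it appears in the log, so the parsed dictionary can be matched back to the raw output.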
2) Single-image prediction, MKL-DNN enabled
.\main --model_dir=model --image_file=1.jpg --use_mkldnn=True
Output:
E:\Program\swineCountingCPU\out\Release>.\main --model_dir=model \ --image_file=1.jpg \ --use_mkldnn=True
--- fused 0 elementwise_add with relu activation
--- fused 0 elementwise_add with tanh activation
--- fused 0 elementwise_add with leaky_relu activation
--- fused 0 elementwise_add with swish activation
--- fused 0 elementwise_add with hardswish activation
--- fused 0 elementwise_add with sqrt activation
--- fused 0 elementwise_add with abs activation
--- fused 0 elementwise_add with clip activation
--- fused 0 elementwise_add with gelu activation
--- fused 0 elementwise_add with relu6 activation
--- fused 0 elementwise_add with sigmoid activation
--- fused 0 elementwise_sub with relu activation
--- fused 0 elementwise_sub with tanh activation
--- fused 0 elementwise_sub with leaky_relu activation
--- fused 0 elementwise_sub with swish activation
--- fused 0 elementwise_sub with hardswish activation
--- fused 0 elementwise_sub with sqrt activation
--- fused 0 elementwise_sub with abs activation
--- fused 0 elementwise_sub with clip activation
--- fused 0 elementwise_sub with gelu activation
--- fused 0 elementwise_sub with relu6 activation
--- fused 0 elementwise_sub with sigmoid activation
--- fused 0 elementwise_mul with relu activation
--- fused 0 elementwise_mul with tanh activation
--- fused 0 elementwise_mul with leaky_relu activation
--- fused 0 elementwise_mul with swish activation
--- fused 0 elementwise_mul with hardswish activation
--- fused 0 elementwise_mul with sqrt activation
--- fused 0 elementwise_mul with abs activation
--- fused 0 elementwise_mul with clip activation
--- fused 0 elementwise_mul with gelu activation
--- fused 0 elementwise_mul with relu6 activation
--- fused 0 elementwise_mul with sigmoid activation
total images = 1, batch_size = 1, total steps = 1
class=0 confidence=0.8868 rect=[608 842 1034 1011]
class=0 confidence=0.8771 rect=[1499 668 1616 883]
class=0 confidence=0.8517 rect=[363 464 489 671]
class=0 confidence=0.7817 rect=[349 702 605 792]
class=0 confidence=0.7756 rect=[215 405 355 641]
class=0 confidence=0.7470 rect=[175 632 242 768]
class=0 confidence=0.7394 rect=[839 714 1279 812]
class=0 confidence=0.6689 rect=[1266 148 1438 242]
class=0 confidence=0.6608 rect=[1400 449 1498 615]
class=0 confidence=0.6606 rect=[521 190 634 295]
class=0 confidence=0.6350 rect=[1406 282 1521 410]
class=0 confidence=0.5884 rect=[425 795 640 964]
class=0 confidence=0.5517 rect=[452 196 524 295]
class=0 confidence=0.5035 rect=[1607 741 1674 912]
1.jpg The number of detected box: 14
Visualized output saved as output\1.jpg
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0623 15:34:10.344063 5212 main.cc:76] ----------------------- Config info -----------------------
I0623 15:34:10.349049 5212 main.cc:77] runtime_device: CPU
I0623 15:34:10.349049 5212 main.cc:78] ir_optim: True
I0623 15:34:10.350046 5212 main.cc:80] enable_memory_optim: True
I0623 15:34:10.351044 5212 main.cc:89] enable_tensorrt: False
I0623 15:34:10.351044 5212 main.cc:91] precision: fp32
I0623 15:34:10.352041 5212 main.cc:94] enable_mkldnn: True
I0623 15:34:10.352041 5212 main.cc:95] cpu_math_library_num_threads: 1
I0623 15:34:10.353039 5212 main.cc:96] ----------------------- Data info -----------------------
I0623 15:34:10.356032 5212 main.cc:97] batch_size: 1
I0623 15:34:10.356032 5212 main.cc:98] input_shape: dynamic shape
I0623 15:34:10.362093 5212 main.cc:100] ----------------------- Model info -----------------------
I0623 15:34:10.363085 5212 main.cc:102] model_name: model
I0623 15:34:10.364081 5212 main.cc:104] ----------------------- Perf info ------------------------
I0623 15:34:10.364081 5212 main.cc:105] Total number of predicted data: 1 and total time spent(ms): 743
I0623 15:34:10.365078 5212 main.cc:108] preproce_time(ms): 27.2847, inference_time(ms): 716.013, postprocess_time(ms): 0.0357
3) Single-image prediction, MKL-DNN enabled, CPU threads set to 4
.\main --model_dir=model --image_file=1.jpg --use_mkldnn=True --cpu_threads=4
Output:
E:\Program\swineCountingCPU\out\Release>.\main --model_dir=model \ --image_file=1.jpg \ --use_mkldnn=True \ --cpu_threads=4
--- fused 0 elementwise_add with relu activation
--- fused 0 elementwise_add with tanh activation
--- fused 0 elementwise_add with leaky_relu activation
--- fused 0 elementwise_add with swish activation
--- fused 0 elementwise_add with hardswish activation
--- fused 0 elementwise_add with sqrt activation
--- fused 0 elementwise_add with abs activation
--- fused 0 elementwise_add with clip activation
--- fused 0 elementwise_add with gelu activation
--- fused 0 elementwise_add with relu6 activation
--- fused 0 elementwise_add with sigmoid activation
--- fused 0 elementwise_sub with relu activation
--- fused 0 elementwise_sub with tanh activation
--- fused 0 elementwise_sub with leaky_relu activation
--- fused 0 elementwise_sub with swish activation
--- fused 0 elementwise_sub with hardswish activation
--- fused 0 elementwise_sub with sqrt activation
--- fused 0 elementwise_sub with abs activation
--- fused 0 elementwise_sub with clip activation
--- fused 0 elementwise_sub with gelu activation
--- fused 0 elementwise_sub with relu6 activation
--- fused 0 elementwise_sub with sigmoid activation
--- fused 0 elementwise_mul with relu activation
--- fused 0 elementwise_mul with tanh activation
--- fused 0 elementwise_mul with leaky_relu activation
--- fused 0 elementwise_mul with swish activation
--- fused 0 elementwise_mul with hardswish activation
--- fused 0 elementwise_mul with sqrt activation
--- fused 0 elementwise_mul with abs activation
--- fused 0 elementwise_mul with clip activation
--- fused 0 elementwise_mul with gelu activation
--- fused 0 elementwise_mul with relu6 activation
--- fused 0 elementwise_mul with sigmoid activation
total images = 1, batch_size = 1, total steps = 1
class=0 confidence=0.8868 rect=[608 842 1034 1011]
class=0 confidence=0.8771 rect=[1499 668 1616 883]
class=0 confidence=0.8517 rect=[363 464 489 671]
class=0 confidence=0.7817 rect=[349 702 605 792]
class=0 confidence=0.7756 rect=[215 405 355 641]
class=0 confidence=0.7470 rect=[175 632 242 768]
class=0 confidence=0.7394 rect=[839 714 1279 812]
class=0 confidence=0.6689 rect=[1266 148 1438 242]
class=0 confidence=0.6608 rect=[1400 449 1498 615]
class=0 confidence=0.6606 rect=[521 190 634 295]
class=0 confidence=0.6350 rect=[1406 282 1521 410]
class=0 confidence=0.5884 rect=[425 795 640 964]
class=0 confidence=0.5517 rect=[452 196 524 295]
class=0 confidence=0.5035 rect=[1607 741 1674 912]
1.jpg The number of detected box: 14
Visualized output saved as output\1.jpg
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0623 15:56:34.208850 5036 main.cc:76] ----------------------- Config info -----------------------
I0623 15:56:34.208850 5036 main.cc:77] runtime_device: CPU
I0623 15:56:34.214835 5036 main.cc:78] ir_optim: True
I0623 15:56:34.214835 5036 main.cc:80] enable_memory_optim: True
I0623 15:56:34.216826 5036 main.cc:89] enable_tensorrt: False
I0623 15:56:34.220818 5036 main.cc:91] precision: fp32
I0623 15:56:34.222811 5036 main.cc:94] enable_mkldnn: True
I0623 15:56:34.226802 5036 main.cc:95] cpu_math_library_num_threads: 4
I0623 15:56:34.229801 5036 main.cc:96] ----------------------- Data info -----------------------
I0623 15:56:34.230789 5036 main.cc:97] batch_size: 1
I0623 15:56:34.230789 5036 main.cc:98] input_shape: dynamic shape
I0623 15:56:34.231786 5036 main.cc:100] ----------------------- Model info -----------------------
I0623 15:56:34.232785 5036 main.cc:102] model_name: model
I0623 15:56:34.236773 5036 main.cc:104] ----------------------- Perf info ------------------------
I0623 15:56:34.237771 5036 main.cc:105] Total number of predicted data: 1 and total time spent(ms): 697
I0623 15:56:34.244756 5036 main.cc:108] preproce_time(ms): 22.1875, inference_time(ms): 675.815, postprocess_time(ms): 0.104
Note: to run prediction on a folder of images, replace --image_file with --image_dir, followed by the path to the image folder.
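The three invocations above, and the `--image_dir` variant, differ only in a few flags, so they can be generated from one helper. This is a sketch with a hypothetical function name; it uses only flags that appear in this article. Building the command as a list also sidesteps line-continuation differences between shells (cmd.exe uses `^` rather than `\`).

```python
from typing import List, Optional


def main_args(model_dir: str, image_file: Optional[str] = None,
              image_dir: Optional[str] = None, use_mkldnn: bool = False,
              cpu_threads: int = 1, exe: str = r".\main") -> List[str]:
    """Build the main.exe argument list for a single image or an image folder."""
    if (image_file is None) == (image_dir is None):
        raise ValueError("pass exactly one of image_file or image_dir")
    args = [exe, "--model_dir=" + model_dir]
    if image_file is not None:
        args.append("--image_file=" + image_file)
    else:
        args.append("--image_dir=" + image_dir)
    if use_mkldnn:
        args.append("--use_mkldnn=True")
    if cpu_threads != 1:
        # Omitted for 1 thread, the value the runs without this flag report.
        args.append("--cpu_threads=" + str(cpu_threads))
    return args


if __name__ == "__main__":
    # The three configurations compared in this article.
    for mkldnn, threads in ((False, 1), (True, 1), (True, 4)):
        print(" ".join(main_args("model", image_file="1.jpg",
                                 use_mkldnn=mkldnn, cpu_threads=threads)))
```

The resulting list can be passed to `subprocess.run` from inside the Release folder to execute a run.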
Comparison:
No. | MKL-DNN enabled | CPU threads | Preprocess (ms) | Inference (ms) | Postprocess (ms) |
---|---|---|---|---|---|
1 | No | 1 | 25.7308 | 814.418 | 0.0381 |
2 | Yes | 1 | 27.2847 | 716.013 | 0.0357 |
3 | Yes | 4 | 22.1875 | 675.815 | 0.104 |
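To put the inference times in relative terms, the speedup over the baseline run can be computed directly from the logged values. The sketch below does that arithmetic; each configuration was timed on a single image in a single run, so the figures are indicative only.

```python
from typing import List, Tuple

# (configuration, inference_time in ms) taken from the three logs in this article.
RUNS = [
    ("no MKL-DNN, 1 thread", 814.418),
    ("MKL-DNN, 1 thread", 716.013),
    ("MKL-DNN, 4 threads", 675.815),
]


def speedups(runs: List[Tuple[str, float]]) -> List[Tuple[str, float, float]]:
    """Return (label, speedup factor, percent time saved) relative to the first run."""
    baseline = runs[0][1]
    return [(label, baseline / ms, 100.0 * (baseline - ms) / baseline) for label, ms in runs]


if __name__ == "__main__":
    for label, factor, saved in speedups(RUNS):
        print("{:<22} {:.3f}x  {:.1f}% less time".format(label, factor, saved))
```

On these numbers, enabling MKL-DNN cuts inference time by roughly 12%, and adding four threads brings the reduction to roughly 17% relative to the baseline.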
This article is a repost. Original: https://aistudio.baidu.com/aistudio/projectdetail/4339850?channelType=0&channel=0