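0. Preface

Once an object detection model has been trained, the next step is deployment. Common deployment options for AI applications include local, cloud, and edge deployment. This section covers local deployment: on Windows, using C++ and the CPU build of the Paddle Inference library, we build an .exe executable that runs the model.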

1. Prerequisites

  • Windows 10
  • Visual Studio 2019 Community
  • CMake 3.0+
  • Python 3.7
  • OpenCV 3.4.6
  • Paddle Inference 2.3.0 (CPU build)

Note: because inference runs on the CPU here, CUDA, cuDNN, and TensorRT do not need to be installed.

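2. Download the inference code

To speed up model development and deployment, we recommend PaddleDetection, an object detection toolkit with built-in training and deployment support for a wide range of scenarios.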

git clone https://github.com/PaddlePaddle/PaddleDetection

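After cloning, go to the deploy/cpp directory, which contains the C++ deployment code needed for this task. Because this directory does not depend on any other directory in PaddleDetection, it can be copied out on its own, renamed (here, swineCountingCPU), and placed in the project directory (here, E:\Program).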

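3. Download paddle_inference, the PaddlePaddle C++ inference library

PaddlePaddle provides prebuilt C++ inference libraries for different CPU and CUDA versions. Download the CPU build from the C++ inference library download list.

For easier management, rename the extracted library to paddle_inference2.3_cpu and place it in the E:\Program\tools folder.
The library contains the following directories and files:

paddle_inference2.3_cpu
|--paddle # Paddle core library and headers
|--third_party # third-party dependencies and headers
|--version.txt # version and build information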

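4. Install and configure OpenCV

The main steps are:

  1. Download OpenCV 3.4.6 for Windows from the official OpenCV website.

  2. Run the downloaded executable (.exe) and extract OpenCV to a directory of your choice, here the E:\Program\tools folder.

  3. Configure the environment variable as follows (this step can be skipped if absolute paths are used throughout):

    • My Computer -> Properties -> Advanced system settings -> Environment Variables
    • Under System variables, find Path (create it if it does not exist) and double-click to edit it
    • Click New, enter the OpenCV bin path, for example E:\Program\tools\opencv\build\x64\vc15\bin, and save

The final deployment directory layout is:

|--E:\Program
    |--swineCountingCPU # copy of PaddleDet2.4-swineCounting/deploy/cpp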
       |--cmake
       |--docs
       |--include
       |--scripts
       |--src
       |--CMakeLists.txt
    |--tools
       |--opencv # version 3.4.6
       |--paddle_inference2.3_cpu
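
Before moving on to the build, it can help to confirm that this layout is actually in place. The following is a minimal Python sketch (Python 3.7 is already among the prerequisites) that lists which expected entries are missing. The helper name missing_entries and the EXPECTED list are illustrative, derived from the tree above; adjust them if your paths differ.

```python
from pathlib import Path
from typing import List

# Entries expected under the project root, taken from the directory tree above.
EXPECTED = [
    "swineCountingCPU/CMakeLists.txt",
    "swineCountingCPU/include",
    "swineCountingCPU/src",
    "tools/opencv",
    "tools/paddle_inference2.3_cpu",
]


def missing_entries(root: Path, expected: List[str]) -> List[str]:
    """Return the relative paths in `expected` that do not exist under `root`."""
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    root = Path(r"E:\Program")  # project root used in this article
    missing = missing_entries(root, EXPECTED)
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Directory layout looks complete.")
```

The check only tests for existence; it does not verify library versions.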

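5. Build

5.1 Change to the E:\Program\swineCountingCPU directory

cd /d E:\Program\swineCountingCPU

5.2 Generate the project files with CMake

CMake can generate the project files in two ways: with cmake-gui or with the cmake command. Here we use cmake-gui and write the project files to the out directory.

Steps:

1) Open cmake-gui on Windows and, in the "Where is the source code" field, enter the path of the inference directory, here E:\Program\swineCountingCPU;
2) In the "Where to build the binaries" field, enter the output path for the generated project, here E:\Program\swineCountingCPU\out;
3) Click Configure, select the Visual Studio 2019 x64 option, then click Generate;
4) An error is reported at this point, which is expected: the OPENCV_DIR, PADDLE_DIR, and PADDLE_LIB_NAME entries still need to be filled in;
5) After filling them in, click Generate again to produce the .sln solution.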

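The build parameters are described below (entries marked with * are needed only with the GPU build of the inference library; keep the CUDA library version aligned, using 9.0 or 10.0 rather than 9.2, 10.1, etc.):

Parameter          Description
*CUDA_LIB          Path to the CUDA libraries
*CUDNN_LIB         Path to the cuDNN libraries
OPENCV_DIR         OpenCV installation path
PADDLE_DIR         Path to the Paddle inference library
PADDLE_LIB_NAME    Name of the Paddle inference library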

Notes:
For the CPU build of the inference library, uncheck WITH_GPU
If you are using the OpenBLAS build, uncheck WITH_MKL
If you do not need keypoint models, uncheck WITH_KEYPOINT
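
The cmake-gui steps above can also be expressed as a single cmake command line. The Python sketch below only assembles and prints such a command; it does not run it. Several things here are assumptions rather than details from this article: the generator name "Visual Studio 16 2019" with -A x64 and the -S/-B options (these need a considerably newer CMake than the 3.0 minimum listed in the prerequisites), the value paddle_inference for PADDLE_LIB_NAME, passing the OpenCV root E:\Program\tools\opencv as OPENCV_DIR, and the ON/OFF choices for the WITH_* options. Check each of them against your own setup.

```python
import subprocess
from typing import Dict, List


def cmake_configure_args(source_dir: str, build_dir: str,
                         cache_vars: Dict[str, str]) -> List[str]:
    """Build the argument list for a CMake configure step (VS 2019, x64)."""
    args = [
        "cmake",
        "-S", source_dir,
        "-B", build_dir,
        "-G", "Visual Studio 16 2019",
        "-A", "x64",
    ]
    # Each cache variable becomes a -DNAME=VALUE argument, in insertion order.
    for name, value in cache_vars.items():
        args.append("-D{}={}".format(name, value))
    return args


if __name__ == "__main__":
    cache_vars = {
        "OPENCV_DIR": r"E:\Program\tools\opencv",  # assumed: OpenCV root folder
        "PADDLE_DIR": r"E:\Program\tools\paddle_inference2.3_cpu",
        "PADDLE_LIB_NAME": "paddle_inference",  # assumed library name
        "WITH_GPU": "OFF",       # CPU build of the inference library
        "WITH_MKL": "ON",        # assumed: MKL build, not OpenBLAS
        "WITH_KEYPOINT": "OFF",  # assumed: keypoint models not needed
    }
    args = cmake_configure_args(
        r"E:\Program\swineCountingCPU",
        r"E:\Program\swineCountingCPU\out",
        cache_vars,
    )
    # Print the command for review instead of executing it.
    print(subprocess.list2cmdline(args))
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed line can be pasted into a command prompt once the values have been verified.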

5.3 Compile

Once the .sln solution has been generated, open PaddleObjectDetector.sln in the out folder with Visual Studio 2019, set the build configuration to Release, and choose Build -> Rebuild Solution.

The console output looks like this:

3>Generating code...
3>main.vcxproj -> E:\Program\swineCounting_inference\PaddleDet2.4-swineCounting\deploy\cpp\Release\main.exe
3>Done building project "main.vcxproj".
4>------ Build started: Project: ALL_BUILD, Configuration: Release x64 ------
4>Building Custom Rule E:/Program/swineCounting_inference/PaddleDet2.4-swineCounting/deploy/cpp/CMakeLists.txt
========== Build: 4 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========

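5.4 Inference and visualization

After main.exe has been built, copy paddle2onnx.dll from E:\Program\tools\paddle_inference2.3_cpu\third_party\install\paddle2onnx\lib and onnxruntime.dll from E:\Program\tools\paddle_inference2.3_cpu\third_party\install\onnxruntime\lib into the out\Release directory. Optionally, also place the exported model folder model and the test image 1.jpg in the Release directory. Then change to the Release folder.

Notes:

a. The model folder holds the trained and exported model, with the following files: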

model.pdiparams
model.pdiparams.info
model.pdmodel
infer_cfg.yml

b. output is the output folder

c. 1.jpg is the image to run inference on
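
The two DLL copies described above can be scripted so they are easy to repeat after a clean rebuild. The following is a minimal Python sketch under the paths used in this article; the function names are illustrative, and the script simply skips a file if its source or the Release folder is not found.

```python
import shutil
from pathlib import Path
from typing import List


def runtime_dll_sources(paddle_dir: Path) -> List[Path]:
    """Locations of the two DLLs inside the extracted inference library."""
    install = paddle_dir / "third_party" / "install"
    return [
        install / "paddle2onnx" / "lib" / "paddle2onnx.dll",
        install / "onnxruntime" / "lib" / "onnxruntime.dll",
    ]


def copy_runtime_dlls(paddle_dir: Path, release_dir: Path) -> List[Path]:
    """Copy the DLLs that exist into release_dir; return the copied files."""
    copied = []
    for src in runtime_dll_sources(paddle_dir):
        if src.is_file() and release_dir.is_dir():
            copied.append(Path(shutil.copy2(str(src), str(release_dir))))
        else:
            print("Skipped (source or target missing): {}".format(src))
    return copied


if __name__ == "__main__":
    paddle_dir = Path(r"E:\Program\tools\paddle_inference2.3_cpu")
    release_dir = Path(r"E:\Program\swineCountingCPU\out\Release")
    for dst in copy_runtime_dlls(paddle_dir, release_dir):
        print("Copied: {}".format(dst))
```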

Run inference:

1) Single image, MKL-DNN disabled

.\main --model_dir=model --image_file=1.jpg

The output is:

E:\Program\swineCountingCPU\out\Release>.\main --model_dir=model \    --image_file=1.jpg
total images = 1, batch_size = 1, total steps = 1
class=0 confidence=0.8868 rect=[608 842 1034 1011]
class=0 confidence=0.8771 rect=[1499 668 1616 883]
class=0 confidence=0.8517 rect=[363 464 489 671]
class=0 confidence=0.7817 rect=[349 702 605 792]
class=0 confidence=0.7756 rect=[215 405 355 641]
class=0 confidence=0.7470 rect=[175 632 242 768]
class=0 confidence=0.7394 rect=[839 714 1279 812]
class=0 confidence=0.6689 rect=[1266 148 1438 242]
class=0 confidence=0.6608 rect=[1400 449 1498 615]
class=0 confidence=0.6606 rect=[521 190 634 295]
class=0 confidence=0.6350 rect=[1406 282 1521 410]
class=0 confidence=0.5884 rect=[425 795 640 964]
class=0 confidence=0.5517 rect=[452 196 524 295]
class=0 confidence=0.5035 rect=[1607 741 1674 912]
1.jpg The number of detected box: 14
Visualized output saved as output\1.jpg
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0623 15:38:42.337302 17188 main.cc:76] ----------------------- Config info -----------------------
I0623 15:38:42.337302 17188 main.cc:77] runtime_device: CPU
I0623 15:38:42.338304 17188 main.cc:78] ir_optim: True
I0623 15:38:42.339303 17188 main.cc:80] enable_memory_optim: True
I0623 15:38:42.339303 17188 main.cc:89] enable_tensorrt: False
I0623 15:38:42.340296 17188 main.cc:91] precision: fp32
I0623 15:38:42.341290 17188 main.cc:94] enable_mkldnn: False
I0623 15:38:42.342288 17188 main.cc:95] cpu_math_library_num_threads: 1
I0623 15:38:42.343286 17188 main.cc:96] ----------------------- Data info -----------------------
I0623 15:38:42.344286 17188 main.cc:97] batch_size: 1
I0623 15:38:42.346278 17188 main.cc:98] input_shape: dynamic shape
I0623 15:38:42.350271 17188 main.cc:100] ----------------------- Model info -----------------------
I0623 15:38:42.350271 17188 main.cc:102] model_name: model
I0623 15:38:42.355252 17188 main.cc:104] ----------------------- Perf info ------------------------
I0623 15:38:42.356251 17188 main.cc:105] Total number of predicted data: 1 and total time spent(ms): 839
I0623 15:38:42.356251 17188 main.cc:108] preproce_time(ms): 25.7308, inference_time(ms): 814.418, postprocess_time(ms): 0.0381
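
Each detection in the output above is printed on its own line in the form class=<id> confidence=<score> rect=[<four integers>], which makes the results easy to post-process, for example to count boxes above a chosen confidence. The Python sketch below parses such lines with a regular expression. The Detection type and the parse_detections name are illustrative, and the four rect values are kept exactly as printed without interpreting them.

```python
import re
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    class_id: int
    confidence: float
    rect: Tuple[int, int, int, int]


# Matches lines such as: class=0 confidence=0.8868 rect=[608 842 1034 1011]
_LINE = re.compile(
    r"class=(\d+) confidence=([0-9.]+) rect=\[(-?\d+) (-?\d+) (-?\d+) (-?\d+)\]"
)


def parse_detections(text: str) -> List[Detection]:
    """Extract every detection line found in the captured console output."""
    return [
        Detection(
            int(m.group(1)),
            float(m.group(2)),
            (int(m.group(3)), int(m.group(4)), int(m.group(5)), int(m.group(6))),
        )
        for m in _LINE.finditer(text)
    ]


if __name__ == "__main__":
    sample = (
        "class=0 confidence=0.8868 rect=[608 842 1034 1011]\n"
        "class=0 confidence=0.5035 rect=[1607 741 1674 912]\n"
    )
    detections = parse_detections(sample)
    confident = [d for d in detections if d.confidence >= 0.6]
    print("parsed:", len(detections), "with confidence >= 0.6:", len(confident))
```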

2) Single image, MKL-DNN enabled

.\main --model_dir=model --image_file=1.jpg --use_mkldnn=True

The output is:

E:\Program\swineCountingCPU\out\Release>.\main --model_dir=model \    --image_file=1.jpg \    --use_mkldnn=True
---    fused 0 elementwise_add with relu activation
---    fused 0 elementwise_add with tanh activation
---    fused 0 elementwise_add with leaky_relu activation
---    fused 0 elementwise_add with swish activation
---    fused 0 elementwise_add with hardswish activation
---    fused 0 elementwise_add with sqrt activation
---    fused 0 elementwise_add with abs activation
---    fused 0 elementwise_add with clip activation
---    fused 0 elementwise_add with gelu activation
---    fused 0 elementwise_add with relu6 activation
---    fused 0 elementwise_add with sigmoid activation
---    fused 0 elementwise_sub with relu activation
---    fused 0 elementwise_sub with tanh activation
---    fused 0 elementwise_sub with leaky_relu activation
---    fused 0 elementwise_sub with swish activation
---    fused 0 elementwise_sub with hardswish activation
---    fused 0 elementwise_sub with sqrt activation
---    fused 0 elementwise_sub with abs activation
---    fused 0 elementwise_sub with clip activation
---    fused 0 elementwise_sub with gelu activation
---    fused 0 elementwise_sub with relu6 activation
---    fused 0 elementwise_sub with sigmoid activation
---    fused 0 elementwise_mul with relu activation
---    fused 0 elementwise_mul with tanh activation
---    fused 0 elementwise_mul with leaky_relu activation
---    fused 0 elementwise_mul with swish activation
---    fused 0 elementwise_mul with hardswish activation
---    fused 0 elementwise_mul with sqrt activation
---    fused 0 elementwise_mul with abs activation
---    fused 0 elementwise_mul with clip activation
---    fused 0 elementwise_mul with gelu activation
---    fused 0 elementwise_mul with relu6 activation
---    fused 0 elementwise_mul with sigmoid activation
total images = 1, batch_size = 1, total steps = 1
class=0 confidence=0.8868 rect=[608 842 1034 1011]
class=0 confidence=0.8771 rect=[1499 668 1616 883]
class=0 confidence=0.8517 rect=[363 464 489 671]
class=0 confidence=0.7817 rect=[349 702 605 792]
class=0 confidence=0.7756 rect=[215 405 355 641]
class=0 confidence=0.7470 rect=[175 632 242 768]
class=0 confidence=0.7394 rect=[839 714 1279 812]
class=0 confidence=0.6689 rect=[1266 148 1438 242]
class=0 confidence=0.6608 rect=[1400 449 1498 615]
class=0 confidence=0.6606 rect=[521 190 634 295]
class=0 confidence=0.6350 rect=[1406 282 1521 410]
class=0 confidence=0.5884 rect=[425 795 640 964]
class=0 confidence=0.5517 rect=[452 196 524 295]
class=0 confidence=0.5035 rect=[1607 741 1674 912]
1.jpg The number of detected box: 14
Visualized output saved as output\1.jpg
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0623 15:34:10.344063  5212 main.cc:76] ----------------------- Config info -----------------------
I0623 15:34:10.349049  5212 main.cc:77] runtime_device: CPU
I0623 15:34:10.349049  5212 main.cc:78] ir_optim: True
I0623 15:34:10.350046  5212 main.cc:80] enable_memory_optim: True
I0623 15:34:10.351044  5212 main.cc:89] enable_tensorrt: False
I0623 15:34:10.351044  5212 main.cc:91] precision: fp32
I0623 15:34:10.352041  5212 main.cc:94] enable_mkldnn: True
I0623 15:34:10.352041  5212 main.cc:95] cpu_math_library_num_threads: 1
I0623 15:34:10.353039  5212 main.cc:96] ----------------------- Data info -----------------------
I0623 15:34:10.356032  5212 main.cc:97] batch_size: 1
I0623 15:34:10.356032  5212 main.cc:98] input_shape: dynamic shape
I0623 15:34:10.362093  5212 main.cc:100] ----------------------- Model info -----------------------
I0623 15:34:10.363085  5212 main.cc:102] model_name: model
I0623 15:34:10.364081  5212 main.cc:104] ----------------------- Perf info ------------------------
I0623 15:34:10.364081  5212 main.cc:105] Total number of predicted data: 1 and total time spent(ms): 743
I0623 15:34:10.365078  5212 main.cc:108] preproce_time(ms): 27.2847, inference_time(ms): 716.013, postprocess_time(ms): 0.0357

3) Single image, MKL-DNN enabled, 4 CPU threads

.\main --model_dir=model --image_file=1.jpg --use_mkldnn=True --cpu_threads=4

The output is:

E:\Program\swineCountingCPU\out\Release>.\main --model_dir=model \    --image_file=1.jpg \    --use_mkldnn=True \    --cpu_threads=4
---    fused 0 elementwise_add with relu activation
---    fused 0 elementwise_add with tanh activation
---    fused 0 elementwise_add with leaky_relu activation
---    fused 0 elementwise_add with swish activation
---    fused 0 elementwise_add with hardswish activation
---    fused 0 elementwise_add with sqrt activation
---    fused 0 elementwise_add with abs activation
---    fused 0 elementwise_add with clip activation
---    fused 0 elementwise_add with gelu activation
---    fused 0 elementwise_add with relu6 activation
---    fused 0 elementwise_add with sigmoid activation
---    fused 0 elementwise_sub with relu activation
---    fused 0 elementwise_sub with tanh activation
---    fused 0 elementwise_sub with leaky_relu activation
---    fused 0 elementwise_sub with swish activation
---    fused 0 elementwise_sub with hardswish activation
---    fused 0 elementwise_sub with sqrt activation
---    fused 0 elementwise_sub with abs activation
---    fused 0 elementwise_sub with clip activation
---    fused 0 elementwise_sub with gelu activation
---    fused 0 elementwise_sub with relu6 activation
---    fused 0 elementwise_sub with sigmoid activation
---    fused 0 elementwise_mul with relu activation
---    fused 0 elementwise_mul with tanh activation
---    fused 0 elementwise_mul with leaky_relu activation
---    fused 0 elementwise_mul with swish activation
---    fused 0 elementwise_mul with hardswish activation
---    fused 0 elementwise_mul with sqrt activation
---    fused 0 elementwise_mul with abs activation
---    fused 0 elementwise_mul with clip activation
---    fused 0 elementwise_mul with gelu activation
---    fused 0 elementwise_mul with relu6 activation
---    fused 0 elementwise_mul with sigmoid activation
total images = 1, batch_size = 1, total steps = 1
class=0 confidence=0.8868 rect=[608 842 1034 1011]
class=0 confidence=0.8771 rect=[1499 668 1616 883]
class=0 confidence=0.8517 rect=[363 464 489 671]
class=0 confidence=0.7817 rect=[349 702 605 792]
class=0 confidence=0.7756 rect=[215 405 355 641]
class=0 confidence=0.7470 rect=[175 632 242 768]
class=0 confidence=0.7394 rect=[839 714 1279 812]
class=0 confidence=0.6689 rect=[1266 148 1438 242]
class=0 confidence=0.6608 rect=[1400 449 1498 615]
class=0 confidence=0.6606 rect=[521 190 634 295]
class=0 confidence=0.6350 rect=[1406 282 1521 410]
class=0 confidence=0.5884 rect=[425 795 640 964]
class=0 confidence=0.5517 rect=[452 196 524 295]
class=0 confidence=0.5035 rect=[1607 741 1674 912]
1.jpg The number of detected box: 14
Visualized output saved as output\1.jpg
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0623 15:56:34.208850  5036 main.cc:76] ----------------------- Config info -----------------------
I0623 15:56:34.208850  5036 main.cc:77] runtime_device: CPU
I0623 15:56:34.214835  5036 main.cc:78] ir_optim: True
I0623 15:56:34.214835  5036 main.cc:80] enable_memory_optim: True
I0623 15:56:34.216826  5036 main.cc:89] enable_tensorrt: False
I0623 15:56:34.220818  5036 main.cc:91] precision: fp32
I0623 15:56:34.222811  5036 main.cc:94] enable_mkldnn: True
I0623 15:56:34.226802  5036 main.cc:95] cpu_math_library_num_threads: 4
I0623 15:56:34.229801  5036 main.cc:96] ----------------------- Data info -----------------------
I0623 15:56:34.230789  5036 main.cc:97] batch_size: 1
I0623 15:56:34.230789  5036 main.cc:98] input_shape: dynamic shape
I0623 15:56:34.231786  5036 main.cc:100] ----------------------- Model info -----------------------
I0623 15:56:34.232785  5036 main.cc:102] model_name: model
I0623 15:56:34.236773  5036 main.cc:104] ----------------------- Perf info ------------------------
I0623 15:56:34.237771  5036 main.cc:105] Total number of predicted data: 1 and total time spent(ms): 697
I0623 15:56:34.244756  5036 main.cc:108] preproce_time(ms): 22.1875, inference_time(ms): 675.815, postprocess_time(ms): 0.104

Note: to run inference on a folder of images, replace --image_file with --image_dir, followed by the path to the image folder.

Comparison:

No.  MKL-DNN   CPU threads  preprocess (ms)  inference (ms)  postprocess (ms)
1    disabled  1            25.7308          814.418         0.0381
2    enabled   1            27.2847          716.013         0.0357
3    enabled   4            22.1875          675.815         0.104
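
To make the comparison easier to read, the inference times can be turned into speedups relative to the first run. The Python sketch below does this with the inference_time(ms) values from the three logs above; the dictionary keys and the function name are illustrative. Each configuration was measured once on a single image, so the ratios (roughly 12% and 17% less inference time than the baseline) are indicative only.

```python
from typing import Dict

# inference_time(ms) values copied from the three runs above.
INFERENCE_MS = {
    "mkldnn off, 1 thread": 814.418,
    "mkldnn on, 1 thread": 716.013,
    "mkldnn on, 4 threads": 675.815,
}


def relative_speedups(times_ms: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Return baseline_time / time for every configuration."""
    base = times_ms[baseline]
    return {name: base / ms for name, ms in times_ms.items()}


if __name__ == "__main__":
    speedups = relative_speedups(INFERENCE_MS, "mkldnn off, 1 thread")
    for name, factor in speedups.items():
        saved = (1.0 - 1.0 / factor) * 100.0  # percent of baseline time saved
        print("{:<22} speedup x{:.2f}, time saved {:.1f}%".format(name, factor, saved))
```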

This article is a repost. Original: https://aistudio.baidu.com/aistudio/projectdetail/4339850?channelType=0&channel=0
