PaddlePaddle Regular Competition: Chinese Scene Text Recognition - December 3rd-Place Solution
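Reposted from AI Studio. Original post: PaddlePaddle Regular Competition: Chinese Scene Text Recognition - December 3rd-Place Solution - PaddlePaddle AI Studio
Regular Competition: Chinese Scene Text Recognition, December 2021 - 3rd-Place Technical Solution
- This project shares the technical solution that placed 3rd in the December 2021 round of the regular competition "Chinese Scene Text Recognition". The final score was 84.22158.
- This project uses PaddleOCR-develop (the static-graph version). PaddleOCR consists of three parts: DB text detection, detection-box rectification, and CRNN text recognition. This Chinese scene text recognition task needs only the third stage, the text recognizer. The CRNN text recognition model serves as the baseline. For more about PaddleOCR, see PaddleOCR-develop.
- The sections below cover four topics: environment setup, data processing, model tuning, and training and prediction.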
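Introduction
(1) Overview
This project uses PaddleOCR, a text recognition model toolkit, to solve text recognition in scene images. It first sets up the PaddleOCR environment with git, then processes the data: extracting the dataset, preparing the data, and augmenting the images. Next it configures the model parameters, trains the model, exports it, and predicts the results. A sorting script then reorders the prediction txt file to match the competition's submission requirements. Finally, the project summarizes the lessons learned and outlines directions for improvement, as a reference for other participants.
Note: some steps in this project require manual configuration (for example, copying code provided in this project into the corresponding configuration file).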
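(2) About the competition
- Chinese scene text recognition attracts wide attention in everyday life and has many applications, such as photo translation, image retrieval, and scene understanding. However, text in Chinese scenes is complicated by illumination changes, low resolution, diverse fonts and layouts, and the large number of Chinese character classes. Handling all of these is a highly challenging task.
- The Chinese scene text recognition regular competition has been fully upgraded. It provides a lightweight Chinese scene text recognition dataset and requires participants to use the PaddlePaddle framework to predict the text lines in image regions and return their content.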
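(3) Key points and difficulties
- The competition mainly tests the participant's choice of tools. PaddleOCR is a good choice for this task: PP-OCR is a practical, ultra-lightweight OCR system released on the PaddlePaddle platform, made up of text detection, detection-box rectification, and text recognition.
- Using PaddleOCR proficiently is therefore one difficulty of this competition.
- In addition, a satisfactory score requires data augmentation and sensible hyperparameter tuning, which are the other difficulties.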
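(4) About the data
- The dataset contains 60,000 images in total: 50,000 for training and 10,000 for testing. It was collected from Chinese street views by cropping the text-line regions (for example shop signs and landmarks) out of street-view images. All images in the dataset have undergone some preprocessing.
- Annotation file: the platform provides the annotations as a .csv file whose four columns are the image width, image height, file name, and text label.
- Note: only the data provided by the competition may be used for training. Training on other open-source datasets is not allowed.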
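The four-column annotation file described above has to be turned into the label list that a PaddleOCR recognition model trains on, which is one `image_path<TAB>text` entry per line. The following is a minimal sketch of that conversion, not the author's actual script. The header row, the `train_images` directory name, and the sample rows are assumptions made for illustration.

```python
import csv
import io


def csv_to_rec_labels(csv_file, image_dir="train_images", has_header=True):
    """Convert rows of (width, height, filename, text) into 'path<TAB>text' lines.

    image_dir and has_header are illustrative assumptions; adjust them to the
    actual layout of the competition data.
    """
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    lines = []
    for row in reader:
        if len(row) < 4:
            continue  # skip malformed rows
        name = row[2].strip()
        # Re-join any extra fields in case an unquoted label contained commas.
        text = ",".join(row[3:]).strip()
        if not name or not text:
            continue  # an empty label cannot be used for training
        lines.append(f"{image_dir}/{name}\t{text}")
    return lines


# Hypothetical sample standing in for the real annotation file.
sample = io.StringIO(
    "w,h,name,value\n"
    "128,48,img_0.jpg,便利店\n"
    "96,32,img_1.jpg,停车场\n"
)
labels = csv_to_rec_labels(sample)
for line in labels:
    print(line)
```

With the real file, the `io.StringIO` sample would be replaced by `open(path, encoding="utf-8", newline="")`, and the returned lines written to the label file that the training configuration points at.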
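(5) Highlights of this solution
- This project uses text_render for data augmentation and adjusts the configuration file accordingly.
- It also puts particular effort into hyperparameter tuning.
- To meet the submission requirements, it includes a Python script that sorts the out-of-order lines of the prediction txt file into the order the competition requires and generates the result file.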
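The sorting step in the last highlight can be sketched as follows. This is an illustrative sketch rather than the author's script: it assumes each prediction line has the form `filename<TAB>text` and that the required order is the natural (numeric-aware) order of the file names. Both are assumptions, as are the sample file names.

```python
import re


def natural_key(name):
    """Split 'img_10.jpg' into ['img_', 10, '.jpg'] so that 2 sorts before 10."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def sort_predictions(lines):
    """Sort 'filename<TAB>text' lines by file name in natural order."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        name, _, text = line.partition("\t")
        records.append((name, text))
    records.sort(key=lambda record: natural_key(record[0]))
    return [f"{name}\t{text}" for name, text in records]


# Hypothetical out-of-order predictions.
shuffled = ["img_10.jpg\t药店", "img_2.jpg\t银行", "img_1.jpg\t超市"]
ordered = sort_predictions(shuffled)
for line in ordered:
    print(line)
```

A plain string sort would place `img_10.jpg` before `img_2.jpg`, which is why the key converts the digit runs to integers before comparing.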
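1. Environment Setup
1.1 Install PaddleOCR
- AI Studio already provides an environment with paddlepaddle 1.8.4 and Python 3.7, so you only need to install PaddleOCR by following the official tutorial.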
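Before cloning PaddleOCR, it can help to confirm that the runtime matches the versions stated above. The check below is a small sketch and not part of the original project. The helper names `version_tuple` and `check_environment` are made up for illustration, and the only PaddlePaddle attribute it relies on is `paddle.__version__`.

```python
import sys


def version_tuple(version):
    """Parse the leading numeric components: '1.8.4' -> (1, 8, 4).

    Parsing stops at the first non-numeric component, so '2.0.0rc1' -> (2, 0).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_environment(expected_python=(3, 7), expected_paddle="1.8.4"):
    """Report the Python and paddle versions found and whether they match."""
    found_python = tuple(sys.version_info[:2])
    try:
        import paddle  # only available where PaddlePaddle is installed
        found_paddle = paddle.__version__
    except ImportError:
        found_paddle = None
    return {
        "python": found_python,
        "python_matches": found_python == expected_python,
        "paddle": found_paddle,
        "paddle_matches": found_paddle is not None
        and version_tuple(found_paddle) == version_tuple(expected_paddle),
    }


report = check_environment()
print(report)
```

If either `*_matches` flag is false, the static-graph configuration used in the rest of this project may not behave as described here.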
In [1]
!cd ~/work && git clone -b develop https://gitee.com/paddlepaddle/PaddleOCR.git
Cloning into 'PaddleOCR'...
remote: Enumerating objects: 31443, done.
remote: Counting objects: 100% (5822/5822), done.
remote: Compressing objects: 100% (2358/2358), done.
remote: Total 31443 (delta 3940), reused 5082 (delta 3373), pack-reused 25621
Receiving objects: 100% (31443/31443), 258.73 MiB | 37.88 MiB/s, done.
Resolving deltas: 100% (21801/21801), done.
Checking connectivity... done.
In [2]
!cd ~/work/PaddleOCR && pip install -r requirements.txt && python setup.py install
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
...
Installing collected packages: pyclipper, lmdb, shapely, numpy, tifffile, scipy, PyWavelets, opencv-python, scikit-image, imgaug
...
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
parl 1.4.1 requires pyzmq==18.1.1, but you have pyzmq 22.3.0 which is incompatible.
Successfully installed PyWavelets-1.3.0 imgaug-0.4.0 lmdb-1.3.0 numpy-1.21.6 opencv-python-4.2.0.32 pyclipper-1.3.0.post2 scikit-image-0.19.2 scipy-1.7.3 shapely-1.8.1.post1 tifffile-2021.11.2
running install
running bdist_egg
...
creating 'dist/paddleocr-1.1.2-py3.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
...
Installed /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddleocr-1.1.2-py3.7.egg
Processing dependencies for paddleocr==1.1.2
...
Using /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages Searching for kiwisolver==1.1.0 Best match: kiwisolver 1.1.0 Adding kiwisolver 1.1.0 to easy-install.pth file Using /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages Searching for decorator==4.4.0 Best match: decorator 4.4.0 Adding decorator 4.4.0 to easy-install.pth file Using /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages Searching for setuptools==41.4.0 Best match: setuptools 41.4.0 Adding setuptools 41.4.0 to easy-install.pth file Installing easy_install script to /opt/conda/envs/python35-paddle120-env/bin Using /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages Finished processing dependencies for paddleocr==1.1.2
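Before moving on, it can be worth confirming that the install actually succeeded. The snippet below is a minimal sketch rather than part of the original project: the helper name `is_installed` is hypothetical, and the module list is an assumption derived from the dependencies named in the install log (`opencv-python` is imported as `cv2`, `Shapely` as `shapely`).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


# Top-level modules assumed from the install log above; adjust as needed.
for name in ("paddleocr", "cv2", "imgaug", "lmdb", "pyclipper", "shapely"):
    status = "ok" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

`find_spec` only locates the module without importing it, so the check stays fast and does not trigger any framework initialization.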
2. Data Processing
2.1 Extracting the datasets
In [3]
!cd ~/data/data62842/ && unzip train_images.zip
!cd ~/data/data62843/ && unzip test_images.zip
inflating: test_images/3359.jpg
inflating: test_images/4436.jpg
inflating: test_images/5728.jpg
...
inflating: test_images/0.jpg
inflating: __MACOSX/test_images/._0.jpg
...
inflating: test_images/6419.jpg
inflating: test_images/1376.jpg
In [4]
!cd ~/data/data62842/ && mv train_images ../ && mv train_label.csv ../
!cd ~/data/data62843/ && mv test_images ../
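2.2 Data augmentation
- In this competition, data augmentation is used to reduce overfitting; it helps most when the dataset is small.
- I used text_renderer for data augmentation. The main operations are light/dark variation, text border adjustment, added noise, color adjustment, and font effects.
- After installing text_renderer, manually edit text_renderer/configs/default.yaml as follows: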
# Small font_size will make text look blurred/prydown
font_size:
  min: 14
  max: 23

# choose Text color range
# color boundary is in R,G,B format
font_color:
  enable: true
  blue:
    fraction: 0.5
    l_boundary: [0,0,150]
    h_boundary: [60,60,255]
  brown:
    fraction: 0.5
    l_boundary: [139,70,19]
    h_boundary: [160,82,43]

# By default, text is drawn by Pillow with (https://stackoverflow.com/questions/43828955/measuring-width-of-text-python-pil)
# If `random_space` is enabled, some text will be drawn char by char with a random space
random_space:
  enable: false
  fraction: 0.3
  min: -0.1 # -0.1 will make chars very close or even overlapped
  max: 0.1

# Do remap with sin()
# Currently this process is very slow!
curve:
  enable: false
  fraction: 0.3
  period: 360 # degree, period of the sin function
  min: 1 # amplitude range of the sin function
  max: 5

# random crop text height
crop:
  enable: false
  fraction: 0.5
  # top and bottom will be applied equally
  top:
    min: 5
    max: 10 # in pixel, this value should be smaller than img_height
  bottom:
    min: 5
    max: 10 # in pixel, this value should be smaller than img_height

# Use image in bg_dir as background for text
img_bg:
  enable: false
  fraction: 0.5

# Does not work when random_space is applied
text_border:
  enable: true
  fraction: 0.3
  # lighter than word color
  light:
    enable: true
    fraction: 0.5
  # darker than word color
  dark:
    enable: true
    fraction: 0.5

# https://docs.opencv.org/3.4/df/da0/group__photo__clone.html#ga2bf426e4c93a6b1f21705513dfeca49d
# https://www.cs.virginia.edu/~connelly/class/2014/comp_photo/proj2/poisson.pdf
# Use opencv seamlessClone() to draw text on background
# For some background images, this makes the text image look more real
seamless_clone:
  enable: true
  fraction: 0.5

perspective_transform:
  max_x: 25
  max_y: 25
  max_z: 3

blur:
  enable: true
  fraction: 0.03

# If blur is applied to an image, prydown will not be applied to it
prydown:
  enable: true
  fraction: 0.03
  max_scale: 1.5 # Image will first resize to 1.5x, and then resize to 1x

noise:
  enable: true
  fraction: 0.3
  gauss:
    enable: true
    fraction: 0.25
  uniform:
    enable: true
    fraction: 0.25
  salt_pepper:
    enable: true
    fraction: 0.25
  poisson:
    enable: true
    fraction: 0.25

line:
  enable: false
  fraction: 0.05
  under_line:
    enable: false
    fraction: 0.2
  table_line:
    enable: false
    fraction: 0.3
  middle_line:
    enable: false
    fraction: 0.5

line_color:
  enable: false
  black:
    fraction: 0.5
    l_boundary: [0,0,0]
    h_boundary: [64,64,64]
  blue:
    fraction: 0.5
    l_boundary: [0,0,150]
    h_boundary: [60,60,255]

# These operations are applied on the final output image,
# so they can also be applied in the training process as a data augmentation method.

# By default, text is darker than background.
# If `reverse_color` is enabled, some images will have dark background and light text
reverse_color:
  enable: true
  fraction: 0.3

emboss:
  enable: true
  fraction: 0.3

sharp:
  enable: true
  fraction: 0.3
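Each effect in the configuration above has an `enable` switch and a `fraction`. One plausible reading, assumed here rather than taken from the text_renderer source, is that an enabled effect is applied to roughly that fraction of the generated images. The sketch below only illustrates this interpretation; `should_apply` and the `effects` dictionary are hypothetical names, not part of text_renderer.

```python
import random


def should_apply(effect, rng):
    """Return True if an effect should be applied to the current image.

    Assumption: `enable` switches the effect on or off, and `fraction`
    is the probability of applying it to any single image.
    """
    return bool(effect["enable"]) and rng.random() < effect["fraction"]


# A few entries copied from the configuration above.
effects = {
    "text_border": {"enable": True, "fraction": 0.3},
    "blur": {"enable": True, "fraction": 0.03},
    "curve": {"enable": False, "fraction": 0.3},
}

rng = random.Random(0)  # fixed seed so the run is repeatable
num_images = 10000
counts = {name: 0 for name in effects}
for _ in range(num_images):
    for name, effect in effects.items():
        if should_apply(effect, rng):
            counts[name] += 1

# Observed share of images that received each effect.
for name, count in counts.items():
    print(name, count / num_images)
```

Under this reading, a disabled effect such as `curve` never fires, while `text_border` touches about 30% of the images.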
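- FAQ 1.1.8 of PaddleOCR notes that its recognition model was trained on roughly 5.2 million images (260,000 real plus 5 million synthetic), which shows how much data augmentation matters.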
In [5]
!cd ~/work && git clone https://github.com/Sanster/text_renderer
!cd ~/work/text_renderer && pip install -r requirements.txt
Cloning into 'text_renderer'...
remote: Enumerating objects: 707, done.
remote: Counting objects: 100% (19/19), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 707 (delta 4), reused 7 (delta 2), pack-reused 688
Receiving objects: 100% (707/707), 12.92 MiB | 29.00 KiB/s, done.
Resolving deltas: 100% (387/387), done.
Checking connectivity... done.
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: Cython in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 1)) (0.29)
Requirement already satisfied: opencv-python in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 2)) (4.2.0.32)
Requirement already satisfied: pillow in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (7.1.2)
Requirement already satisfied: numpy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 4)) (1.21.6)
Requirement already satisfied: matplotlib in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 5)) (2.2.3)
Collecting fontTools
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/bc/83/43991c6f0dfb395cc9ccf5c19fd51fc6068cb3919cee4b78eddd4b16efd1/fonttools-4.32.0-py3-none-any.whl (900 kB)
Collecting tenacity
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f2/a5/f86bc8d67c979020438c8559cc70cfe3a1643fd160d35e09c9cca6a09189/tenacity-8.0.1-py3-none-any.whl (24 kB)
Requirement already satisfied: easyDict in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 8)) (1.9)
Collecting pyyaml==5.1
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9f/2c/9417b5c774792634834e730932745bc09a7d36754ca00acf1ccd1ac2594d/PyYAML-5.1.tar.gz (274 kB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->-r requirements.txt (line 5)) (1.1.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->-r requirements.txt (line 5)) (3.0.7)
Requirement already satisfied: pytz in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->-r requirements.txt (line 5)) (2022.1)
Requirement already satisfied: six>=1.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->-r requirements.txt (line 5)) (1.16.0)
Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->-r requirements.txt (line 5)) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->-r requirements.txt (line 5)) (0.10.0)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->-r requirements.txt (line 5)) (41.4.0)
Building wheels for collected packages: pyyaml
Building wheel for pyyaml (setup.py) ... done
Created wheel for pyyaml: filename=PyYAML-5.1-cp37-cp37m-linux_x86_64.whl size=44074 sha256=daf0a0735387c6b458dfe6ea44d8171e841f474a5b6c03b02e6e466946b4b4e3
Stored in directory: /home/aistudio/.cache/pip/wheels/94/62/26/bddefc8ed4a42614da5668d92d375be4e5c7818a4847ff2a21
Successfully built pyyaml
Installing collected packages: tenacity, pyyaml, fontTools
Attempting uninstall: pyyaml
Found existing installation: PyYAML 5.1.2
Uninstalling PyYAML-5.1.2:
Successfully uninstalled PyYAML-5.1.2
Successfully installed fontTools-4.32.0 pyyaml-5.1 tenacity-8.0.1
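- Image-size statistics for the training set show that the height is fixed at 48 pixels, while the width depends on the number of characters in the image.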
In [ ]
import glob
import os

import cv2


def get_aspect_ratio(img_set_dir):
    """Return the mean aspect ratio, mean width/height, and size histograms of all .jpg images."""
    m_width = 0
    m_height = 0
    width_dict = {}
    height_dict = {}
    images = glob.glob(os.path.join(img_set_dir, '*.jpg'))
    for image in images:
        img = cv2.imread(image)
        height, width = int(img.shape[0]), int(img.shape[1])
        # Count how many images have each width and each height
        width_dict[width] = width_dict.get(width, 0) + 1
        height_dict[height] = height_dict.get(height, 0) + 1
        m_width += width
        m_height += height
    m_width = m_width / len(images)
    m_height = m_height / len(images)
    aspect_ratio = m_width / m_height
    # Sort both histograms by frequency, most common size first
    width_dict = dict(sorted(width_dict.items(), key=lambda item: item[1], reverse=True))
    height_dict = dict(sorted(height_dict.items(), key=lambda item: item[1], reverse=True))
    return aspect_ratio, m_width, m_height, width_dict, height_dict


aspect_ratio, m_width, m_height, width_dict, height_dict = get_aspect_ratio("/home/aistudio/data/train_images/")
print("aspect ratio is: {}, mean width is: {}, mean height is: {}".format(aspect_ratio, m_width, m_height))
print("Width dict:{}".format(width_dict))
print("Height dict:{}".format(height_dict))
import pandas as pd


def Q2B(s):
    """Convert one full-width character to half-width."""
    inside_code = ord(s)
    if inside_code == 0x3000:
        inside_code = 0x0020
    else:
        inside_code -= 0xfee0
    if inside_code < 0x0020 or inside_code > 0x7e:  # not a half-width character after conversion: keep the original
        return s
    return chr(inside_code)


def stringQ2B(s):
    """Convert all full-width characters in a string to half-width."""
    return "".join([Q2B(c) for c in s])


def is_chinese(s):
    """Check whether every character is a Chinese character."""
    for c in s:
        if c < u'\u4e00' or c > u'\u9fa5':
            return False
    return True


def is_number(s):
    """Check whether every character is a digit."""
    for c in s:
        if c < u'\u0030' or c > u'\u0039':
            return False
    return True


def is_alphabet(s):
    """Check whether every character is a lowercase English letter."""
    for c in s:
        if c < u'\u0061' or c > u'\u007a':
            return False
    return True


def del_other(s):
    """Remove every character that is not Chinese, a digit, or a lowercase English letter."""
    res = str()
    for c in s:
        if not (is_chinese(c) or is_number(c) or is_alphabet(c)):
            c = ""
        res += c
    return res


df = pd.read_csv("/home/aistudio/data/train_label.csv", encoding="gbk")
name, value = list(df.name), list(df.value)
for i, label in enumerate(value):
    # Full-width to half-width
    label = stringQ2B(label)
    # Uppercase to lowercase
    label = label.lower()
    # Drop everything except Chinese characters, digits, and lowercase letters (this also removes spaces)
    label = del_other(label)
    value[i] = label

# Drop rows whose label is now empty
data = [item for item in zip(name, value) if item[1] != ""]

# Save to the data directory
with open("/home/aistudio/data/train_label.txt", "w", encoding="utf-8") as f:
    for line in data:
        f.write(line[0] + "\t" + line[1] + "\n")

# Record the longest label in the training set
label_max_len = 0
with open("/home/aistudio/data/train_label.txt", "r", encoding="utf-8") as f:
    for line in f:
        img_name, label = line.strip().split("\t")
        if len(label) > label_max_len:
            label_max_len = len(label)
print("label max len: ", label_max_len)
def create_label_list(train_list):
    classSet = set()
    with open(train_list, encoding="utf-8") as f:
        # train_label.txt has no header row, so every line is read
        for line in f:
            img_name, label = line.strip().split("\t")
            for e in label:
                classSet.add(e)
    # Sorted character list (no explicit blank entry is written here)
    classList = sorted(classSet)
    with open("/home/aistudio/data/label_list.txt", "w", encoding="utf-8") as f:
        for idx, c in enumerate(classList):
            f.write("{}\t{}\n".format(c, idx))
    # Provide the character set used for data augmentation
    with open("/home/aistudio/work/text_renderer/data/chars/ch.txt", "w", encoding="utf-8") as f:
        for c in classList:
            f.write("{}\n".format(c))
    return classSet


classSet = create_label_list("/home/aistudio/data/train_label.txt")
print("classify num: ", len(classSet))
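The label cleaning above (full-width to half-width, lowercasing, then dropping everything that is not a Chinese character, digit, or lowercase letter) can be illustrated on a single string. This is a compact sketch of the same idea, not the code used for the submission; `to_halfwidth`, `is_kept`, and `clean_label` are hypothetical helper names, and the sample string is made up.

```python
def to_halfwidth(ch):
    """Map a full-width character to its half-width form when one exists."""
    code = ord(ch)
    if code == 0x3000:  # ideographic (full-width) space
        return " "
    if 0xFF01 <= code <= 0xFF5E:  # full-width variants of printable ASCII
        return chr(code - 0xFEE0)
    return ch


def is_kept(ch):
    """Keep characters in U+4E00..U+9FA5, ASCII digits, and lowercase a-z."""
    return ("\u4e00" <= ch <= "\u9fa5") or ("0" <= ch <= "9") or ("a" <= ch <= "z")


def clean_label(label):
    """Full-width to half-width, lowercase, then drop every other character."""
    halfwidth = "".join(to_halfwidth(ch) for ch in label)
    return "".join(ch for ch in halfwidth.lower() if is_kept(ch))


# Made-up sample: full-width letters and digits, a space, Chinese text, full-width "!".
print(clean_label("ABC123 咖啡店!"))  # prints abc123咖啡店
```

Labels that end up empty after this step are the ones the preprocessing drops from the training list.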
- Generate 2,000 images for each text length of 1, 2, 3, 4, and 5 characters, 10,000 images in total.
In [ ]
# Clear any previously generated images
!cd ~/work/text_renderer/output/default && rm ./*
In [ ]
!cd ~/work/text_renderer && python main.py --length 1 --img_width 32 --img_height 48 --chars_file "./data/chars/ch.txt" --corpus_mode 'random' --num_img 2000
!cd ~/work/text_renderer && python main.py --length 2 --img_width 64 --img_height 48 --chars_file "./data/chars/ch.txt" --corpus_mode 'random' --num_img 2000
!cd ~/work/text_renderer && python main.py --length 3 --img_width 96 --img_height 48 --chars_file "./data/chars/ch.txt" --corpus_mode 'random' --num_img 2000
!cd ~/work/text_renderer && python main.py --length 4 --img_width 128 --img_height 48 --chars_file "./data/chars/ch.txt" --corpus_mode 'random' --num_img 2000
!cd ~/work/text_renderer && python main.py --length 5 --img_width 160 --img_height 48 --chars_file "./data/chars/ch.txt" --corpus_mode 'random' --num_img 2000
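The five commands above follow one pattern: the image width grows by 32 pixels per character while the height stays at 48. As a small sketch, the same commands can be produced in a loop; `build_commands`, `CHAR_WIDTH`, and `IMG_HEIGHT` are hypothetical names, and only flags that already appear in the commands above are used.

```python
CHAR_WIDTH = 32   # pixels per character, as in the commands above
IMG_HEIGHT = 48   # matches the fixed height of the training images


def build_commands(max_length=5, num_img=2000):
    """Build one text_renderer command per text length (1..max_length)."""
    commands = []
    for length in range(1, max_length + 1):
        commands.append(
            "python main.py --length {} --img_width {} --img_height {} "
            "--chars_file ./data/chars/ch.txt --corpus_mode random "
            "--num_img {}".format(length, CHAR_WIDTH * length, IMG_HEIGHT, num_img)
        )
    return commands


for command in build_commands():
    print(command)
```

Generating the commands this way makes it easy to try longer texts or a different number of images per length.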
- Merge the generated dataset with the original dataset
In [9]
!cp ~/work/text_renderer/output/default/*.jpg ~/data/train_images
In [10]
import os

with open('work/text_renderer/output/default/tmp_labels.txt', 'r', encoding='utf-8') as src_label:
    with open('data/train_label.txt', 'a', encoding='utf-8') as dst_label:
        lines = src_label.readlines()
        for line in lines:
            # Each line is "<image name> <text>"; split on the first space only
            [img, text] = line.split(' ', 1)
            print('{}.jpg\t{}'.format(img, text), file=dst_label, end='')
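The merge step rewrites each generated label line from "image name, space, text" into the tab-separated format used by train_label.txt. The sketch below isolates that conversion so it can be checked on a few lines before appending to the real file; `convert_label_line` is a hypothetical helper, and the sample lines are made up rather than taken from tmp_labels.txt.

```python
def convert_label_line(line):
    """Convert '<image_id> <text>' into '<image_id>.jpg', a tab, then '<text>'.

    Splitting only on the first space keeps the line parseable even if
    the text itself contains a space. Blank lines return None.
    """
    line = line.rstrip("\n")
    if not line:
        return None
    image_id, _, text = line.partition(" ")
    return "{}.jpg\t{}".format(image_id, text)


# Made-up sample lines in the assumed tmp_labels.txt layout.
sample = ["00000000 咖啡\n", "00000001 a1\n", "\n"]
converted = [c for c in map(convert_label_line, sample) if c is not None]
print(converted)
```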
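3. Model tuning
- You can start from the CRNN pretrained model provided by PaddleOCR, or from another model.
- Based on the training-set size statistics above, the model input size is set to height 48 and width 256.
- cosine_decay with warmup is used to speed up convergence.
- CRNN was proposed in the 2015 paper "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition" [paper][code] for recognizing text sequences of variable length.
- CRNN consists of three parts: CNN layers, RNN layers, and a CTC transcription layer. The CNN layers extract features from the input image. The original paper uses a VGG network, whereas PaddleOCR uses ResNet34 and MobileNetV3. Larger models usually perform better, so this project takes ResNet34 as the baseline. One direction for improvement is to change the CNN feature extractor, for example by trying ResNet50 or deeper networks.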
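To make the role of the CTC transcription layer concrete, the sketch below shows greedy (best-path) CTC decoding: take the most likely class per frame, merge consecutive repeats, then remove the blank symbol. It is a generic illustration, not PaddleOCR's decoder; the blank index of 0 and the two-character toy alphabet are assumptions made for this example.

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse repeated ids, then drop blanks (best-path CTC decoding)."""
    decoded = []
    previous = None
    for idx in frame_ids:
        # Emit an id only when it differs from the previous frame and is not blank.
        if idx != previous and idx != blank:
            decoded.append(idx)
        previous = idx
    return decoded


# Hypothetical per-frame argmax output for a toy alphabet (0 is the blank).
alphabet = {1: "店", 2: "铺"}
frames = [0, 1, 1, 0, 0, 2, 2, 2, 0, 2]
ids = ctc_greedy_decode(frames)
print("".join(alphabet[i] for i in ids))  # prints 店铺铺
```

The blank between the last two runs of `2` is what lets CTC output the same character twice in a row, which is why the sequence length of the RNN output can exceed the label length.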
In [ ]
!cd ~/work/PaddleOCR && mkdir pretrain_weights && cd pretrain_weights && wget https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar
In [13]
!cd ~/work/PaddleOCR/pretrain_weights && tar -xf ch_ppocr_server_v1.1_rec_pre.tar
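- In PaddleOCR/configs/rec, add two training configuration files: my_rec_ch_train.yml and my_rec_ch_reader.yml.
- Tuning for this submission: training runs for 161 epochs (epoch 0 to epoch 160), with an initial learning rate of 0.0001, fc_decay of 0.00001, and an L2 regularization coefficient (l2_decay) of 0.00001.
- To speed up learning, cosine_decay and warmup are used, with step_each_epoch set to 1000, warmup_minibatch set to 2000, and 161 total decay epochs.
- In testing, these settings gave good results.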
#my_rec_ch_train.yml
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 161 # number of training epochs
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/my_rec_ch
  save_epoch_step: 20 # save a checkpoint every N epochs
  eval_batch_step: 1000
  train_batch_size_per_card: 256
  test_batch_size_per_card: 128
  image_shape: [3, 48, 256]
  max_text_length: 80
  character_type: ch
  character_dict_path: ./ppocr/utils/ppocr_keys_v1.txt
  loss_type: ctc
  distort: true
  use_space_char: true
  reader_yml: ./configs/rec/my_rec_ch_reader.yml
  pretrain_weights: ./pretrain_weights/ch_ppocr_server_v1.1_rec_pre/best_accuracy
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_resnet_vd,ResNet
  layers: 34

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  fc_decay: 0.00001
  SeqRNN:
    hidden_size: 256

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.0001 # initial learning rate
  l2_decay: 0.00001 # L2 regularization coefficient
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay_warmup
    step_each_epoch: 1000
    total_epoch: 161
    warmup_minibatch: 2000

#my_rec_ch_reader.yml
TrainReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  num_workers: 1
  img_set_dir: /home/aistudio/data/train_images
  label_file_path: /home/aistudio/data/train_label.txt

EvalReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  img_set_dir: /home/aistudio/data/train_images
  label_file_path: /home/aistudio/data/train_label.txt

TestReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
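The `decay` settings in my_rec_ch_train.yml above combine a warmup phase with cosine decay. The sketch below shows one common formulation of that schedule using the configured values: the learning rate rises linearly for the first `warmup_minibatch` steps, then follows a half cosine that is updated once per epoch. This is an assumption about the shape of the schedule, and PaddleOCR's own implementation may differ in detail; `lr_at_step` is a hypothetical name.

```python
import math


def lr_at_step(step, base_lr=0.0001, step_each_epoch=1000,
               total_epoch=161, warmup_minibatch=2000):
    """Learning rate at a global step: linear warmup, then per-epoch cosine decay."""
    if step < warmup_minibatch:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_minibatch
    epoch = step // step_each_epoch
    # Cosine decay: base_lr at epoch 0, close to 0 at the final epoch.
    return base_lr * 0.5 * (math.cos(math.pi * epoch / total_epoch) + 1.0)


for step in (0, 1000, 2000, 80000, 160000):
    print(step, lr_at_step(step))
```

With these values the warmup covers the first two epochs, after which the rate decays smoothly toward zero by epoch 160.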
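4. Training and prediction
- Train the model following the official PaddleOCR tutorial.
4.1 Training the model
- After adding the training configuration files my_rec_ch_train.yml and my_rec_ch_reader.yml, run the following command to start training.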
In [ ]
!cd ~/work/PaddleOCR && python tools/train.py -c configs/rec/my_rec_ch_train.yml
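4.2 Exporting the model (optional)
- After training, the model and parameters are saved under PaddleOCR/output. Choose the checkpoint you want to export. The command below exports the iter_epoch_160 model to PaddleOCR/inference/CRNN_R34; both paths can be changed as needed.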
In [ ]
!cd ~/work/PaddleOCR && python tools/export_model.py -c configs/rec/my_rec_ch_train.yml -o Global.checkpoints=./output/my_rec_ch/iter_epoch_160 Global.save_inference_dir=./inference/CRNN_R34
4.3 Predicting results
Create a new Python file named infer_rec_new.py under work/PaddleOCR/tools/.
Copy the following code into infer_rec_new.py:
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import os
import sys
import glob
import re

__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
sys.path.append(os.path.abspath(os.path.join(__dir__, '..')))


def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# NOTE(paddle-dev): All of these flags should be
# set before `import paddle`. Otherwise, it would
# not take any effect.
set_paddle_flags(
    FLAGS_eager_delete_tensor_gb=0,  # enable GC to save memory
)

import tools.program as program
from paddle import fluid
from ppocr.utils.utility import initial_logger
logger = initial_logger()
from ppocr.utils.utility import enable_static_mode
from ppocr.data.reader_main import reader_main
from ppocr.utils.save_load import init_model
from ppocr.utils.character import CharacterOps
from ppocr.utils.utility import create_module
from ppocr.utils.utility import get_image_file_list


def main():
    config = program.load_config(FLAGS.config)
    program.merge_config(FLAGS.opt)
    logger.info(config)
    char_ops = CharacterOps(config['Global'])
    config['Global']['char_ops'] = char_ops

    # check if set use_gpu=True in paddlepaddle cpu version
    use_gpu = config['Global']['use_gpu']
    # check_gpu(use_gpu)

    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)

    rec_model = create_module(config['Architecture']['function'])(params=config)
    startup_prog = fluid.Program()
    eval_prog = fluid.Program()
    with fluid.program_guard(eval_prog, startup_prog):
        with fluid.unique_name.guard():
            _, outputs = rec_model(mode="test")
            fetch_name_list = list(outputs.keys())
            fetch_varname_list = [outputs[v].name for v in fetch_name_list]
            print(fetch_varname_list)
    eval_prog = eval_prog.clone(for_test=True)
    exe.run(startup_prog)

    init_model(config, eval_prog, exe)

    blobs = reader_main(config, 'test')()  ###
    # print(blobs)
    infer_img = config['Global']['infer_img']
    infer_list = get_image_file_list(infer_img)
    #infer_list.sort(key=lambda x: int(re.split('/home/aistudio/data/test_images/|.jpg',x)[1])) ##
    # print(infer_list)
    #images = glob.glob("/home/aistudio/data/test_images/*.jpg")
    #images.sort(key=lambda x: int(re.split('/home/aistudio/data/test_images/|.jpg',x)[1]))
    max_img_num = len(infer_list)
    if len(infer_list) == 0:
        logger.info("Can not find img in infer_img dir.")

    from tqdm import tqdm
    f = open('test2.txt', mode='w', encoding='utf8')  ###
    f.write('new_name\tvalue\n')  ###
    for i in tqdm(range(max_img_num)):
        #for image in images:
        # print("infer_img:",infer_list[i])
        img = next(blobs)
        predict = exe.run(program=eval_prog,
                          feed={"image": img},  #img
                          fetch_list=fetch_varname_list,
                          return_numpy=False)
        preds = np.array(predict[0])
        if preds.shape[1] == 1:
            preds = preds.reshape(-1)
            preds_lod = predict[0].lod()[0]
            preds_text = char_ops.decode(preds)
        else:
            end_pos = np.where(preds[0, :] == 1)[0]
            if len(end_pos) <= 1:
                preds_text = preds[0, 1:]
            else:
                preds_text = preds[0, 1:end_pos[1]]
            preds_text = preds_text.reshape(-1)
            preds_text = char_ops.decode(preds_text)
        #f.write('{}\t{}\n'.format(os.path.basename(img_path),preds_text)) ###
        f.write('{}\t{}\n'.format(infer_list[i].replace('/home/aistudio/data/test_images/', ''), preds_text))
        #print(image)
        # print("\t index:",preds)
        # print("\t word :",preds_text)
    f.close()

    # save for inference model
    #target_var = []
    #for key, values in outputs.items():
    #    target_var.append(values)
    #fluid.io.save_inference_model(
    #    "./output/",
    #    feeded_var_names=['image'],
    #    target_vars=target_var,
    #    executor=exe,
    #    main_program=eval_prog,
    #    model_filename="model",
    #    params_filename="params")


if __name__ == '__main__':
    enable_static_mode()
    parser = program.ArgsParser()
    FLAGS = parser.parse_args()
    FLAGS.config = 'configs/rec/my_rec_ch_train.yml'
    main()
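- The results (.txt) are saved under work/PaddleOCR/ as test2.txt.
Note: the entries in test2.txt are unordered. The competition requires the predictions to be sorted before submission.
- For the final submission, checkpoints pointed to best_accuracy under /home/aistudio/work/PaddleOCR/output/my_rec_ch/.
- Run the following command to predict on the test images:
- Global.checkpoints: model checkpoint file
- -c: configuration file
- Global.infer_img: path of the images to predict, either a single image file or a directory of images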
In [ ]
%cd ~/work/PaddleOCR
!python tools/infer_rec_new.py \
-c configs/rec/my_rec_ch_train.yml \
-o Global.checkpoints=./output/my_rec_ch/best_accuracy \
Global.infer_img=/home/aistudio/data/test_images
4.4 Sorting the txt file
- A small Python script, ZhuanHuan.py, sorts the entries of the txt file and writes the result to test1112.txt.
- Create ZhuanHuan.py under work/PaddleOCR/ and copy the following code into it.
# Read the unordered predictions written by infer_rec_new.py (test2.txt)
f = open('test2.txt', 'r', encoding='utf8')
something = f.readlines()
f.close()

# Split every line into [file name, predicted text]; new[0] is the header row
new = []
for x in something:
    first = x.strip('\n')
    second = first.split()
    new.append(second)

# Strip the '.jpg' suffix and convert the file name to an integer
for i in range(1, 10001):
    new[i][0] = new[i][0].replace('.jpg', '')
    new[i][0] = int(new[i][0])

f = open('test1112.txt', mode='w', encoding='utf8')
f.write('new_name\tvalue\n')
# For each index b = 0..9999, find the matching row and write it out in order
b = 0
for j in range(10000):
    for i in range(1, 10001):
        if new[i][0] == b:
            if len(new[i]) == 2:
                f.write('{}.jpg\t{}\n'.format(new[i][0], new[i][1]))
            else:
                # Empty prediction: keep the file name with an empty value
                f.write('{}.jpg\t{}\n'.format(new[i][0], ''))
    b = b + 1
    print(j)
f.close()
print("finish")
In [ ]
!python /home/aistudio/work/PaddleOCR/ZhuanHuan.py
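The same reordering can also be expressed with Python's built-in sort, which avoids scanning all 10,000 rows once per index. This is an alternative sketch, not the script used for the submission; `sort_predictions` is a hypothetical helper that assumes every file name is an integer followed by an extension, and the sample rows are made up.

```python
import os


def sort_predictions(lines):
    """Sort 'name.jpg<TAB>text' lines by the integer in the file name."""
    records = []
    for line in lines:
        name, _, text = line.rstrip("\n").partition("\t")
        number = int(os.path.splitext(name)[0])  # "10.jpg" -> 10
        records.append((number, text))
    records.sort(key=lambda record: record[0])
    return ["{}.jpg\t{}".format(number, text) for number, text in records]


# Made-up, shuffled predictions (header line already removed).
shuffled = ["10.jpg\t药店\n", "2.jpg\t\n", "0.jpg\tabc\n"]
print(sort_predictions(shuffled))
```

Sorting on the integer rather than the string matters here: as text, "10.jpg" would come before "2.jpg".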
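5. Summary and outlook
- The parameters in the data augmentation configuration file could be tuned further.
- The training hyperparameters could be adjusted.
- Every network has limited capacity, so improving the network architecture is also worth trying.
6. Advice for other participants
- If you already have some experience, competitions are a good way to build your skills. Consult the official Paddle API documentation and tutorials as you go; they help resolve problems quickly.
- Study projects shared by others to pick up ideas and tuning experience.
- If you have no experience yet, the PaddlePaddle training camps and related courses are a good way to build a foundation.
- In short, combining study with practice is how you improve.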