[iFLYTEK] 2022 Motor Vehicle License Plate Recognition Challenge
A solution for the 2022 iFLYTEK Motor Vehicle License Plate Recognition Challenge. Checkpoint (bayonet) systems are widely deployed on national roads, provincial roads, and highways; combined with ETC, license plate recognition identifies passing vehicles so they can clear the checkpoint without stopping.
1. Competition Background
As the number of vehicles grows worldwide, urban traffic conditions are drawing ever more attention, and efficient traffic management has become a focus for governments and related agencies; building intelligent transportation systems is the clear trend. License plate recognition is a key component of intelligent transportation, and one of its most typical applications is the checkpoint system.
Checkpoint systems are widely deployed on national roads, provincial roads, and highways. Combined with Electronic Toll Collection (ETC), license plate recognition identifies passing vehicles without requiring them to stop, enabling automatic vehicle identification and automatic tolling. The accuracy and efficiency of plate recognition directly determine a checkpoint system's overall performance and how well it generalizes to other scenarios.
Competition page:
http://challenge.xfyun.cn/topic/info?type=license-plate-recognition
2. Task
Participants must recognize the characters on license plate images. An example is shown below:
3. Evaluation Rules
- Data description
The training and test images live in separate folders; each training image's filename is its label.
- Evaluation metric
Submissions are scored by accuracy; the maximum score is 1. A minimal example follows this list.
For the computation, see: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html
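As a small sketch of how the score is computed (the plate strings here are made up for illustration):
from sklearn.metrics import accuracy_score

# Labels and predictions are whole plate strings; a prediction only
# counts as correct when every character matches exactly.
y_true = ["皖A7A699", "沪B12345", "京QL5023"]
y_pred = ["皖A7A699", "沪B12845", "京QL5023"]

print(accuracy_score(y_true, y_pred))  # 2 of 3 exact matches -> 0.666...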
4. Approach
When you see an optical character recognition problem, PaddleOCR should come to mind.
PaddleOCR aims to provide a rich, leading, and practical OCR toolkit that helps developers train better models and put them into production.
Give the repo a star so you don't lose it.
5. Environment Setup
This project runs on AI Studio with the following environment:
OS: Linux
PaddlePaddle: 2.2.2
PaddleOCR: dygraph branch
# Clone the PaddleOCR repo; this takes about a minute
!git clone -b dygraph https://gitee.com/PaddlePaddle/PaddleOCR
Cloning into 'PaddleOCR'...
remote: Enumerating objects: 35352, done.
remote: Counting objects: 100% (9731/9731), done.
remote: Compressing objects: 100% (3798/3798), done.
remote: Total 35352 (delta 6760), reused 8572 (delta 5828), pack-reused 25621
Receiving objects: 100% (35352/35352), 313.53 MiB | 7.89 MiB/s, done.
Resolving deltas: 100% (24621/24621), done.
Checking connectivity... done.
import os
# Enter the repo and install PaddleOCR's Python dependencies
os.chdir('/home/aistudio/PaddleOCR')
!pip install -r requirements.txt -i https://mirror.baidu.com/pypi/simple
!pwd
/home/aistudio/PaddleOCR
6. Dataset Preparation
The goal of this section is to generate the train_list.txt and val_list.txt files used for training.
6.1 Unpack the data
Before training starts, the code below produces label files that match the PP-OCR training format.
# Unpack the data. The extracted folder name may come out garbled; click the three dots next to it in the left file panel and rename it manually. I renamed it to datasets.
!unzip -oq /home/aistudio/data/data159025/机动车车牌识别挑战赛公开数据.zip
!unzip -oq /home/aistudio/datasets/test.zip -d /home/aistudio/datasets/
!unzip -oq /home/aistudio/datasets/train.zip -d /home/aistudio/datasets/
As shown in the figure below:
Check the PaddleOCR docs to see how a custom dataset has to be formatted for training.
6.2 The full label file
Here we want to generate label lines like
datasets/train/皖A7A699.jpg\t皖A7A699
i.e. the image path and its text label separated by a tab.
# Build the annotation file, one line per image, e.g.:
# datasets/train/皖A7A699.jpg\t皖A7A699
import os

# Root directory, matching the folder name created in the left file panel
dirpath = "datasets/train/"

# Build one combined list first; it will be split into train/val later.
# The files come back in sorted order, so the list must be shuffled before splitting.
def get_all_txt():
    all_list = []
    i = 0  # total number of files
    for root, dirs, files in os.walk(dirpath):  # root dir, subdirectories, files
        for file in files:
            i = i + 1
            # Each line: <image path>\t<text label> (note the tab separator);
            # the label is simply the filename without its extension.
            imgpath = os.path.join(root, file)
            name = os.path.splitext(file)[0]
            all_list.append(imgpath + "\t" + name + "\n")
    allstr = ''.join(all_list)
    with open('datasets/all_list.txt', 'w', encoding='utf-8') as f:
        f.write(allstr)
    return all_list, i

all_list, all_lenth = get_all_txt()
print(all_lenth)
21029
The result looks like this:
6.3 Shuffle the data
from sklearn.utils import shuffle

# Shuffle the list (the original order is sorted)
all_list = shuffle(all_list)
allstr = ''.join(all_list)
with open('datasets/all_list.txt', 'w', encoding='utf-8') as f:
    f.write(allstr)
print("Shuffled and rewrote the label file")
Shuffled and rewrote the label file
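Note that shuffle here is unseeded, so every run produces a different train/val split. For a reproducible split (an optional tweak, not part of the original run), pass random_state:
# Fixing random_state makes the shuffle, and hence the split, reproducible
all_list = shuffle(all_list, random_state=42)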
6.4 Split the data
# Split the dataset 85/15; there are 21029 images in total
train_size = int(all_lenth * 0.85)
train_list = all_list[:train_size]
val_list = all_list[train_size:]
print(len(train_list))
print(len(val_list))
17874
3155
# Run this cell to generate the training list
train_txt = ''.join(train_list)
with open('datasets/train_list.txt', 'w', encoding='utf-8') as f_train:
    f_train.write(train_txt)
print("datasets/train_list.txt written!")
# Run this cell to generate the validation list
val_txt = ''.join(val_list)
with open('datasets/val_list.txt', 'w', encoding='utf-8') as f_val:
    f_val.write(val_txt)
print("datasets/val_list.txt written!")
datasets/train_list.txt written!
datasets/val_list.txt written!
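A quick sanity check (an optional sketch, not in the original notebook) that the two lists are disjoint and together cover all 21029 images:
# Read both lists back and verify the split
with open('datasets/train_list.txt', encoding='utf-8') as f:
    train_lines = f.read().splitlines()
with open('datasets/val_list.txt', encoding='utf-8') as f:
    val_lines = f.read().splitlines()
assert len(train_lines) + len(val_lines) == 21029
assert not set(train_lines) & set(val_lines)  # no image appears in both splits
print("split OK:", len(train_lines), "train /", len(val_lines), "val")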
# Move the dataset folder under PaddleOCR
%cd /home/aistudio/
!mv /home/aistudio/datasets/ /home/aistudio/PaddleOCR/
7. Model Preparation and Training
7.1 Download the pretrained model
This competition allows pretrained models, so we start from one.
We use the official ch_PP-OCRv3_rec_train.tar.
%cd /home/aistudio/PaddleOCR/
# Download the pretrained model
!wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar
# Unpack the model weights
%cd pretrain_models
!tar -xf ch_PP-OCRv3_rec_train.tar && rm -rf ch_PP-OCRv3_rec_train.tar
/home/aistudio/PaddleOCR
--2022-07-25 09:59:13-- https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar
Resolving paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)... 100.67.200.6
Connecting to paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)|100.67.200.6|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 287467520 (274M) [application/x-tar]
Saving to: ‘./pretrain_models/ch_PP-OCRv3_rec_train.tar’
ch_PP-OCRv3_rec_tra 100%[===================>] 274.15M 122MB/s in 2.3s
2022-07-25 09:59:15 (122 MB/s) - ‘./pretrain_models/ch_PP-OCRv3_rec_train.tar’ saved [287467520/287467520]
/home/aistudio/PaddleOCR/pretrain_models
7.2 Build the character dictionary
Since this is plate recognition, the character set is just Chinese province characters plus letters and digits, and we need to write it into a txt dictionary file.
Each plate number consists of one Chinese character, one letter, and five letters or digits; a valid Chinese plate thus has seven characters: province (1 character), letter (1 character), letters/digits (5 characters). The three arrays below define these character sets. The last entry of each array is the letter O, not the digit 0: O serves as a "no character" placeholder, since the letter O never appears on Chinese plates. Concatenating one character per position yields a full plate string.
provinces = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂", "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "警", "学", "O"]
alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'O']
ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O']
# Merge the three sets and drop duplicates ('O' and the letters appear in more
# than one list); each character should occur only once in a PP-OCR dict file.
dict_list = list(dict.fromkeys(provinces + alphabets + ads))
# Run this cell to write the dictionary file, one character per line
dict_txt = '\n'.join(dict_list)
with open('/home/aistudio/PaddleOCR/datasets/dict_list.txt', 'w', encoding='utf-8') as f_dict:
    f_dict.write(dict_txt)
print("datasets/dict_list.txt written!")
datasets/dict_list.txt written!
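As a quick check (a hedged sketch; the sample plate is one of the training labels shown earlier), we can verify that a label matches the seven-character structure described above and is fully covered by the dictionary:
sample = "皖A7A699"
assert len(sample) == 7
assert sample[0] in provinces              # position 1: province character
assert sample[1] in alphabets              # position 2: letter
assert all(c in ads for c in sample[2:])   # positions 3-7: letters/digits
assert all(c in dict_list for c in sample) # every character is in the dict
print(sample, "is a valid plate and fully covered by the dict")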
7.3 Start training
# Make sure we are inside the PaddleOCR directory
%cd /home/aistudio/PaddleOCR/
/home/aistudio/PaddleOCR
Writing the yml config file
I base it on the officially recommended config:
PaddleOCR/configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml
Global:
  debug: false
  use_gpu: true
  epoch_num: 800
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v3_distillation
  save_epoch_step: 5
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: True
  infer_img: /home/aistudio/PaddleOCR/datasets/test/000026.jpg
  character_dict_path: /home/aistudio/PaddleOCR/datasets/dict_list.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: false
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_distillation.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Piecewise
    decay_epochs : [700, 800]
    values : [0.0005, 0.00005]
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
    Student:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length

Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDMLLoss:
      weight: 1.0
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: ctc
      name: dml_ctc
  - DistillationDMLLoss:
      weight: 0.5
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: sar
      name: dml_sar
  - DistillationDistanceLoss:
      weight: 1.0
      mode: "l2"
      model_name_pairs:
      - ["Student", "Teacher"]
      key: backbone_out
  - DistillationCTCLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
  - DistillationSARLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True

PostProcess:
  name: DistillationCTCLabelDecode
  model_name: ["Student", "Teacher"]
  key: head_out
  multi_head: True

Metric:
  name: DistillationMetric
  base_metric_name: RecMetric
  main_indicator: acc
  key: "Student"
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/PaddleOCR/
    ext_op_transform_idx: 1
    label_file_list:
    - /home/aistudio/PaddleOCR/datasets/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 64, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 64]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 256
    drop_last: true
    num_workers: 0

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/PaddleOCR/
    label_file_list:
    - /home/aistudio/PaddleOCR/datasets/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 64]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 256
    num_workers: 0
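Indentation is significant in YAML, and a mis-indented key can fail in confusing ways, so it is worth parsing the file once before launching training (a small sketch, assuming the config was saved at the path above):
import yaml
# Parse the config and echo a couple of fields to confirm the edits took effect
with open('configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml') as f:
    cfg = yaml.safe_load(f)
print(cfg['Global']['character_dict_path'])        # our plate dictionary
print(cfg['Train']['dataset']['label_file_list'])  # our training list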
# Train; the training log is saved automatically as train.log under "{save_model_dir}"
# Recognition model training
# GPU training supports single- and multi-card runs; pick the card(s) via CUDA_VISIBLE_DEVICES
%env CUDA_VISIBLE_DEVICES=0
!python3 tools/train.py \
-c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml \
-o Global.pretrained_model=./pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy
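If the session gets interrupted, training can resume from the rolling checkpoint PaddleOCR writes into save_model_dir (standard PaddleOCR usage via Global.checkpoints; the path below assumes the save_model_dir configured above):
# Resume training from the most recent checkpoint instead of the pretrained weights
!python3 tools/train.py \
    -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.checkpoints=./output/rec_ppocr_v3_distillation/latest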
7.4 Visualizing the training log
The classic AI Studio environment cannot display the visualization, so I view the logs locally instead.
Open an Anaconda terminal on the local machine:
(paddle-cpu) C:\Users\Administrator>pip install visualdl
(paddle-cpu) C:\Users\Administrator>d:
Running the visualization command prints:
(paddle-cpu) D:\>visualdl --logdir D:\Desktop --port 8080
VisualDL 2.2.3
Running VisualDL at http://localhost:8080/ (Press CTRL+C to quit)
Serving VisualDL on localhost; to expose to the network, use a proxy or pass --host 0.0.0.0
Then open the URL above in a local browser:
8. Model Evaluation and Prediction
8.1 Evaluate the model
During training, model weights are saved under the Global.save_model_dir directory by default.
To evaluate, set Global.checkpoints to the saved weights file.
The evaluation set can be changed via label_file_list under Eval in the yml.
# GPU evaluation; Global.checkpoints points to the weights being tested
!python3 tools/eval.py \
-c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml \
-o Global.checkpoints=./output/rec_ppocr_v3_distillation/best_accuracy
[2022/07/27 09:27:20] ppocr INFO: start_epoch:174
eval model:: 100%|██████████████████████████████| 25/25 [00:03<00:00, 6.31it/s]
[2022/07/27 09:27:24] ppocr INFO: metric eval ***************
[2022/07/27 09:27:24] ppocr INFO: acc:0.8988906469131833
[2022/07/27 09:27:24] ppocr INFO: norm_edit_dis:0.9790864841052274
[2022/07/27 09:27:24] ppocr INFO: Teacher_acc:0.9103011064649726
[2022/07/27 09:27:24] ppocr INFO: Teacher_norm_edit_dis:0.9817636405377652
[2022/07/27 09:27:24] ppocr INFO: fps:3581.434800871256
8.2 Predict a single image
Pick a relatively clear image for a quick single-image demo:
# Predict; both overrides go on a single -o flag, since a second -o would
# overwrite the first and drop the pretrained_model setting
!python3 tools/infer_rec.py \
    -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.pretrained_model=./output/rec_ppocr_v3_distillation/best_accuracy \
       Global.infer_img=datasets/train/皖AM5989.jpg
8.3 Batch prediction
Point Global.infer_img at the test folder to predict the whole set:
# Predict the whole folder (again, both overrides on one -o flag)
!python3 tools/infer_rec.py \
    -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.pretrained_model=./output/rec_ppocr_v3_distillation/best_accuracy \
       Global.infer_img=./datasets/test/
9. Saving the Results
The predictions are written to
PaddleOCR/output/rec/predicts_ppocrv3_distillation.txt
import json
# Each record in the result txt has the pattern:
#   <image path>\t{"Student": {"label": "闽XXXXX", "score": 0.88}, "Teacher": {"label": "闽XXXXX", "score": 0.88}}
# one record after another; we parse each record and convert it to the submission format.
path = '/home/aistudio/PaddleOCR/output/rec/predicts_ppocrv3_distillation.txt'
with open(path, encoding='utf-8') as f:
    content = f.read()

# Each record starts with './', so splitting on it separates the records
list_all = content.split('./')
path_list = []
label_list = []
for i in range(1, len(list_all)):
    data = list_all[i]
    # Left of the tab: the image path; keep only the filename
    onepath = data.split('\t')[0].split('datasets/test/')[-1]
    # Right of the tab: the JSON dict with the Student and Teacher predictions
    result = json.loads(data.split('\t')[-1])
    s = result["Student"]
    t = result["Teacher"]
    # Keep whichever of the two heads is more confident
    if float(s["score"]) >= float(t["score"]):
        label = s["label"]
    else:
        label = t["label"]
    path_list.append(onepath)
    label_list.append(label)

print(path_list[0])
print(label_list[0])
print(len(path_list))
print(len(label_list))
000001.jpg
E
5000
5000
import pandas as pd
# The dict keys become the csv column names
dataframe = pd.DataFrame({'path': path_list, 'label': label_list})
# Write the DataFrame to csv; index=False omits the row index column
dataframe.to_csv("/home/aistudio/test.csv",index=False,sep=',')
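Before submitting, it is worth reading the file back to confirm the format (an optional check, assuming the csv path used above):
# Reload the submission and spot-check its shape and first rows
sub = pd.read_csv("/home/aistudio/test.csv")
assert list(sub.columns) == ['path', 'label']
assert len(sub) == 5000  # one prediction per test image
print(sub.head())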
Project Summary
This project used PaddleOCR to implement motor vehicle license plate recognition.
Personal Notes
Same handle across the web:
iterhui
I have reached the Supreme level on AI Studio with 10 badges lit up; come follow me and I'll follow back~
https://aistudio.baidu.com/aistudio/personalcenter/thirdview/643467
This article is a repost; original: https://aistudio.baidu.com/aistudio/projectdetail/4357352?contributionType=1