[Free GPU compute pack] To help you fork and run this project, free GPU compute is available here: https://aistudio.baidu.com/aistudio/competition/detail/473/0/introduction

0 Competition Background

In recent years, AI-powered industrial defect detection has become an important application in industrial intelligence, improving inspection efficiency and accuracy while reducing labor costs. This competition takes gear-part anomaly detection as its scenario, encouraging participants to use machine vision to raise both the speed and the accuracy of gear defect detection.

Gear-part anomaly detection is a pain point of industrial defect inspection. Gear transmissions are fundamental components of mechanical equipment. Compared with belt/chain, friction, or hydraulic transmission, they offer a wide power range, high transmission efficiency, smooth motion, an accurate transmission ratio, long service life, and a compact structure; together with their safety, reliability, and cost-effectiveness, this makes them irreplaceable in general machinery. As a typical power-transmission component, gear quality directly determines the performance of the final product.

The machine vision industry is still dominated by a handful of international players. Cognex (US) and Keyence (Japan) together hold more than 50% of the global vision-inspection market, both building solutions on their own core components and technologies (operating systems, sensors, etc.). Domestic machine vision solutions have made great strides but still lag well behind these giants. Gear anomaly detection is therefore genuinely significant for improving the efficiency of industrial quality inspection in China and for safeguarding product quality.

1 Competition Task

Competition link

The task targets gear-part anomaly detection in manufacturing, with a dataset collected from a real production environment. Participants must build their models on Baidu's PaddlePaddle framework and compete on metrics such as precision and recall, strengthening the domestic framework's industrial applications and solving real production pain points. The goal is to automatically detect three defect classes in the test set: tooth-surface black skin, tooth-root black skin, and dents (impact damage).

2 Dataset

The dataset consists of real production data of gear parts provided by FAW Hongqi, all captured on the production line.

To preserve data privacy as the competition requires, the dataset can only be downloaded from the competition platform, may only be used for this competition, and must not be redistributed elsewhere. Register and log in on the competition page, then click the [数据下载] (Data Download) button at the bottom of the page to obtain the dataset.


All images in the dataset are flattened (unrolled) views of real defective gears, annotated by professionals. The sample images clearly mark the defects and the types of recognition lines they contain.

Below, from left to right, the original post shows a gear schematic, a raw image, and an annotated example (images omitted).

Close-up views of the typical defect types are also provided in the original post (images omitted).

Training data file structure

Image data and recognition labels are provided for training, with the following folder structure:

|-- Images/train # training image data, JPEG-encoded image files

|-- Annotations # attribute/label annotation data

Structurally, the annotation files follow the fairly standard COCO annotation format.
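As a quick reference, a COCO detection file is a single JSON object with `images`, `annotations`, and `categories` lists, with bounding boxes stored as `[x_min, y_min, width, height]`. A minimal sketch of its shape (the file name and box values below are made up for illustration; only the category entries mirror this dataset):

```python
import json

# A minimal COCO-style detection annotation (values are illustrative only).
coco = {
    "images": [
        {"id": 0, "file_name": "gear_001.jpg", "width": 2000, "height": 1400},
    ],
    "annotations": [
        # bbox is [x_min, y_min, width, height] in pixels
        {"id": 0, "image_id": 0, "category_id": 1,
         "bbox": [120.0, 340.0, 56.0, 48.0], "area": 2688.0, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "hp_cm", "supercategory": "fitow"},
    ],
}

# Round-trip through JSON, then index annotations by image, as the EDA below does.
anno = json.loads(json.dumps(coco))
by_image = {}
for a in anno["annotations"]:
    by_image.setdefault(a["image_id"], []).append(a)
print(len(by_image[0]))  # number of boxes on image 0 -> 1
```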

3 Exploratory Data Analysis (EDA)

The EDA code in this project comes from a previously compiled COCO object-detection EDA template, with some parameters adjusted to fit this dataset.

# Unzip the dataset: upload it to AI Studio, then adjust the path to match your project
# Note: the archive contains Chinese file names, so specify the encoding when unzipping (or rename the files locally before uploading)
!unzip -O GBK data/data163113/齿轮检测数据集.zip -d ./data/
# Import the third-party libraries we need
import numpy as np
import pandas as pd
import shutil
import json
import os
import cv2
import glob
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import seaborn as sns
from matplotlib.font_manager import FontProperties
from PIL import Image
import random
myfont = FontProperties(fname=r"NotoSansCJKsc-Medium.otf", size=12)
plt.rcParams['figure.figsize'] = (12, 12)
plt.rcParams['font.family']= myfont.get_family()
plt.rcParams['font.sans-serif'] = myfont.get_name()
plt.rcParams['axes.unicode_minus'] = False
# Training-set paths
TRAIN_DIR = 'data/齿轮检测数据集/train/'
TRAIN_CSV_PATH = 'data/齿轮检测数据集/train/train_coco.json'
# List the files in the training directory
# Note: the count includes train_coco.json and Thumbs.db alongside the 2000 images
train_fns = glob.glob(TRAIN_DIR + '*')
print('Files in the training directory: {}'.format(len(train_fns)))
Files in the training directory: 2002
def generate_anno_eda(dataset_path, anno_file):
    with open(os.path.join(dataset_path, anno_file)) as f:
        anno = json.load(f)
    print('Categories:', anno['categories'])
    print('Number of categories:', len(anno['categories']))
    print('Number of training images:', len(anno['images']))
    print('Number of training annotations:', len(anno['annotations']))
    
    # Image size distribution
    total=[]
    for img in anno['images']:
        hw = (img['height'],img['width'])
        total.append(hw)
    unique = set(total)
    for k in unique:
        print('Images with (height, width) = (%d,%d):'%k, total.count(k))
    
    # Annotation-id uniqueness and how many images actually carry boxes
    ids=[]
    images_id=[]
    for i in anno['annotations']:
        ids.append(i['id'])
        images_id.append(i['image_id'])
    print('Number of training images:', len(anno['images']))
    print('Unique annotation ids:', len(set(ids)))
    print('Unique image_ids:', len(set(images_id)))
    
    # Category-id -> name mapping and per-label counts
    category_dic=dict([(i['id'],i['name']) for i in anno['categories']])
    counts_label=dict([(i['name'],0) for i in anno['categories']])
    for i in anno['annotations']:
        counts_label[category_dic[i['category_id']]] += 1
    label_list = counts_label.keys()    # label names
    print('Labels:', label_list)
    size = counts_label.values()    # per-label counts
    print('Label distribution:', size)
    color = ['#FFB6C1', '#D8BFD8', '#9400D3', '#483D8B', '#4169E1', '#00FFFF','#B1FFF0','#ADFF2F','#EEE8AA','#FFA500','#FF6347']
    patches, l_text, p_text = plt.pie(size, labels=label_list, colors=color, labeldistance=1.1, autopct="%1.1f%%", shadow=False, startangle=90, pctdistance=0.6, textprops={'fontproperties':myfont})
    plt.axis("equal")    # equal axes so the pie chart is circular
    plt.legend(prop=myfont)
    plt.show()
# Analyze the training set
generate_anno_eda('data/齿轮检测数据集/train', 'train_coco.json')
Categories: [{'id': 1, 'name': 'hp_cm', 'supercategory': 'fitow'}, {'id': 2, 'name': 'hp_cd', 'supercategory': 'fitow'}, {'id': 3, 'name': 'kp', 'supercategory': 'fitow'}]
Number of categories: 3
Number of training images: 2000
Number of training annotations: 28575
Images with (height, width) = (1500,1000): 561
Images with (height, width) = (2000,1400): 1179
Images with (height, width) = (1500,1400): 260
Number of training images: 2000
Unique annotation ids: 28575
Unique image_ids: 1398
Labels: dict_keys(['hp_cm', 'hp_cd', 'kp'])
Label distribution: dict_values([9300, 13520, 5755])


def generate_anno_result(dataset_path, anno_file):
    with open(os.path.join(dataset_path, anno_file)) as f:
        anno = json.load(f)    
    total=[]
    for img in anno['images']:
        hw = (img['height'],img['width'])
        total.append(hw)
    unique = set(total)
    
    ids=[]
    images_id=[]
    for i in anno['annotations']:
        ids.append(i['id'])
        images_id.append(i['image_id'])
    
    # Category-id -> name mapping and per-label counts
    category_dic=dict([(i['id'],i['name']) for i in anno['categories']])
    counts_label=dict([(i['name'],0) for i in anno['categories']])
    for i in anno['annotations']:
        counts_label[category_dic[i['category_id']]] += 1
    label_list = counts_label.keys()    # label names
    size = counts_label.values()    # per-label counts

    train_fig = pd.DataFrame(anno['images'])
    train_anno = pd.DataFrame(anno['annotations'])
    df_train = pd.merge(left=train_fig, right=train_anno, how='inner', left_on='id', right_on='image_id')
    df_train['bbox_xmin'] = df_train['bbox'].apply(lambda x: x[0])
    df_train['bbox_ymin'] = df_train['bbox'].apply(lambda x: x[1])
    df_train['bbox_w'] = df_train['bbox'].apply(lambda x: x[2])
    df_train['bbox_h'] = df_train['bbox'].apply(lambda x: x[3])
    df_train['bbox_xcenter'] = df_train['bbox'].apply(lambda x: (x[0]+0.5*x[2]))
    df_train['bbox_ycenter'] = df_train['bbox'].apply(lambda x: (x[1]+0.5*x[3]))

    print('Smallest object area (pixels):', min(df_train.area))

    balanced = ''
    small_object = ''
    densely = ''
    # Is the label distribution balanced?
    if max(size) > 5 * min(size):
        print('Classes are imbalanced')
        balanced = 'c11'
    else:
        print('Classes are balanced')
        balanced = 'c10'
    # Are there small objects (area below 30x30 pixels)?
    if min(df_train.area) < 900:
        print('Small objects present')
        small_object = 'c21'
    else:
        print('No small objects')
        small_object = 'c20'
    arr1=[]
    arr2=[]
    x=[]
    y=[]
    w=[]
    h=[]
    for index, row in df_train.iterrows():
        if index < 1000:
            # Record box centers and sizes (first 1000 boxes only, to bound the O(n^2) loop)
            x.append(row['bbox_xcenter'])
            y.append(row['bbox_ycenter'])
            w.append(row['bbox_w'])
            h.append(row['bbox_h'])
    for i in range(len(x)):
        l = np.sqrt(w[i]**2+h[i]**2)
        arr2.append(l)
        for j in range(len(x)):
            a=np.sqrt((x[i]-x[j])**2+(y[i]-y[j])**2)
            if a != 0:
                arr1.append(a)
    arr1=np.array(arr1)
    # Densely packed objects? Heuristic: the closest pair of centers is nearer
    # than the average box diagonal. The logic could still be refined.
    if arr1.min() <  np.mean(arr2):
        print('Densely packed objects')
        densely = 'c31'
    else:
        print('No densely packed objects')
        densely = 'c30'
    return balanced, small_object, densely
# Analyze the training set
generate_anno_result('data/齿轮检测数据集/train', 'train_coco.json')
Smallest object area (pixels): 90
Classes are balanced
Small objects present
Densely packed objects

('c10', 'c21', 'c31')
# Load the training-set annotation file
with open(TRAIN_CSV_PATH, 'r', encoding='utf-8') as f:
    train_data = json.load(f)
train_fig = pd.DataFrame(train_data['images'])
train_fig.head()
   id  width  height  file_name                                         license  date_captured
0   0   1400    2000  1_10_1__H2_817171_IO-NIO198M_210225A0207-1-1.jpg        1
1   1   1400    2000  1_10_3__H2_817171_IO-NIO198M_210225A0207-1-2.jpg        1
2   2   1400    2000  1_10_5__H2_817171_IO-NIO198M_210225A0207-2-1.jpg        1
3   3   1400    2000  1_10_7__H2_817171_IO-NIO198M_210225A0207-2-2.jpg        1
4   4   1400    2000  1_10__H2_817171_IO-NIO198M_210303A0125-2-1.jpg          1
ps = np.zeros(len(train_fig))
for i in range(len(train_fig)):
    ps[i]=train_fig['width'][i] * train_fig['height'][i]/1e6   # megapixels
plt.title('Training-set image size distribution (megapixels)', fontproperties=myfont)
sns.distplot(ps, bins=21,kde=False)
<matplotlib.axes._subplots.AxesSubplot at 0x7fa74e0dbad0>





train_anno = pd.DataFrame(train_data['annotations'])
df_train = pd.merge(left=train_fig, right=train_anno, how='inner', left_on='id', right_on='image_id')
df_train['bbox_xmin'] = df_train['bbox'].apply(lambda x: x[0])
df_train['bbox_ymin'] = df_train['bbox'].apply(lambda x: x[1])
df_train['bbox_w'] = df_train['bbox'].apply(lambda x: x[2])
df_train['bbox_h'] = df_train['bbox'].apply(lambda x: x[3])
df_train['bbox_xcenter'] = df_train['bbox'].apply(lambda x: (x[0]+0.5*x[2]))
df_train['bbox_ycenter'] = df_train['bbox'].apply(lambda x: (x[1]+0.5*x[3]))
def get_all_bboxes(df, name):
    image_bboxes = df[df.file_name == name]
    
    bboxes = []
    for _,row in image_bboxes.iterrows():
        bboxes.append((row.bbox_xmin, row.bbox_ymin, row.bbox_w, row.bbox_h, row.category_id))
    return bboxes

def plot_image_examples(df, rows=3, cols=3, title='Image examples'):
    fig, axs = plt.subplots(rows, cols, figsize=(15,15))
    color = ['#FFB6C1', '#D8BFD8', '#9400D3', '#483D8B', '#4169E1', '#00FFFF','#B1FFF0','#ADFF2F','#EEE8AA','#FFA500','#FF6347']     # 各部分颜色
    for row in range(rows):
        for col in range(cols):
            idx = np.random.randint(len(df), size=1)[0]
            name = df.iloc[idx]["file_name"]
            img = Image.open(TRAIN_DIR + str(name))
            axs[row, col].imshow(img)
            
            bboxes = get_all_bboxes(df, name)
            for bbox in bboxes:
                rect = patches.Rectangle((bbox[0],bbox[1]),bbox[2],bbox[3],linewidth=1,edgecolor=color[bbox[4]],facecolor='none')
                axs[row, col].add_patch(rect)
            
            axs[row, col].axis('off')
            
    plt.suptitle(title,fontproperties=myfont)
def plot_gray_examples(df, rows=3, cols=3, title='Image examples'):
    fig, axs = plt.subplots(rows, cols, figsize=(15,15))
    color = ['#FFB6C1', '#D8BFD8', '#9400D3', '#483D8B', '#4169E1', '#00FFFF','#B1FFF0','#ADFF2F','#EEE8AA','#FFA500','#FF6347']     # 各部分颜色
    for row in range(rows):
        for col in range(cols):
            idx = np.random.randint(len(df), size=1)[0]
            name = df.iloc[idx]["file_name"]
            img = Image.open(TRAIN_DIR + str(name)).convert('L')
            axs[row, col].imshow(img)
            
            bboxes = get_all_bboxes(df, name)
            for bbox in bboxes:
                rect = patches.Rectangle((bbox[0],bbox[1]),bbox[2],bbox[3],linewidth=1,edgecolor=color[bbox[4]],facecolor='none')
                axs[row, col].add_patch(rect)
            
            axs[row, col].axis('off')

            
    plt.suptitle(title,fontproperties=myfont)
def get_image_brightness(image):
    # convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    # get average brightness
    return np.array(gray).mean()

def add_brightness(df):
    brightness = []
    for _, row in df.iterrows():    
        name = row["file_name"]
        image = cv2.imread(TRAIN_DIR + name)
        brightness.append(get_image_brightness(image))
        
    brightness_df = pd.DataFrame(brightness)
    brightness_df.columns = ['brightness']
    df = pd.concat([df, brightness_df], ignore_index=True, axis=1)
    df.columns = ['file_name', 'brightness']
    
    return df

images_df = pd.DataFrame(df_train.file_name.unique())
images_df.columns = ['file_name']
brightness_df = add_brightness(images_df)
brightness_df.head()
                                       file_name  brightness
0  1_1__H2_817171_IO-NIO198M_210316A0004-1-1.jpg   71.169135
1  1_1__H2_817171_IO-NIO198M_210316A0004-1-2.jpg   72.130640
2  1_3__H2_817171_IO-NIO198M_210318A0173-1-1.jpg   72.675181
3  1_3__H2_817171_IO-NIO198M_210401A0156-1-1.jpg   73.313945
4  1_3__H2_817171_IO-NIO198M_210401A0156-1-2.jpg   74.656053
dark_names = brightness_df[brightness_df['brightness'] < 50].file_name
plot_image_examples(df_train[df_train.file_name.isin(dark_names)], title='Dark images')
bright_names =  brightness_df[brightness_df['brightness'] > 130].file_name
plot_image_examples(df_train[df_train.file_name.isin(bright_names)], title='Bright images')


sns.set(rc={'figure.figsize':(12,6)})
ps = np.zeros(len(brightness_df))
for i in range(len(brightness_df)):
    ps[i]=brightness_df['brightness'][i]
plt.title('Image brightness distribution', fontproperties=myfont)
sns.distplot(ps, bins=21,kde=False)
<matplotlib.axes._subplots.AxesSubplot at 0x7fa73a080ad0>


ps = np.zeros(len(df_train))
for i in range(len(df_train)):
    ps[i]=df_train['area'][i]/1e6
plt.title('Training-set object size distribution', fontproperties=myfont)
sns.distplot(ps, bins=21,kde=False)
<matplotlib.axes._subplots.AxesSubplot at 0x7fa739e4fd90>


# Bounding-box shape (w x h) distribution per category
sns.set(rc={'figure.figsize':(12,6)})
sns.relplot(x="bbox_w", y="bbox_h", hue="category_id", col="category_id", data=df_train[0:1000])
<seaborn.axisgrid.FacetGrid at 0x7fa73a106c50>


# Bounding-box center-point distribution per category
sns.set(rc={'figure.figsize':(12,6)})
sns.relplot(x="bbox_xcenter", y="bbox_ycenter", hue="category_id", col="category_id", data=df_train[0:1000]);


sns.set(rc={'figure.figsize':(12,6)})
plt.title('Training-set object size distribution', fontproperties=myfont)
sns.violinplot(x=df_train['category_id'],y=df_train['area'])
# Object-size summary statistics
df_train.area.describe()
count     28575.000000
mean      33341.153946
std      102648.831293
min          90.000000
25%        2072.000000
50%        4920.000000
75%        9170.000000
max      918174.000000
Name: area, dtype: float64
sns.set(rc={'figure.figsize':(12,6)})
plt.title('Training-set small-object distribution', fontproperties=myfont)
plt.ylim(0, 4000)
sns.violinplot(x=df_train['category_id'],y=df_train['area'])
<matplotlib.axes._subplots.AxesSubplot at 0x7f51fa725210>


sns.set(rc={'figure.figsize':(12,6)})
plt.title('Training-set large-object distribution', fontproperties=myfont)
plt.ylim(10000, max(df_train.area))
sns.violinplot(x=df_train['category_id'],y=df_train['area'])
graph=sns.countplot(data=df_train, x='category_id')
graph.set_xticklabels(graph.get_xticklabels(), rotation=90)
plt.title('Object counts per category', fontproperties=myfont)
for p in graph.patches:
    height = p.get_height()
    graph.text(p.get_x()+p.get_width()/2., height + 0.1,height ,ha="center")


# df_train has one row per annotation, so summing bbox_count per file gives the box count per image
df_train['bbox_count'] = df_train.apply(lambda row: 1 if any(row.bbox) else 0, axis=1)
train_images_count = df_train.groupby('file_name').sum().reset_index()
train_images_count['bbox_count'].describe()
count    1398.000000
mean       20.439914
std        22.911058
min         1.000000
25%         4.000000
50%        12.000000
75%        29.000000
max       157.000000
Name: bbox_count, dtype: float64
# Images with more than 50 objects
train_images_count['file_name'][train_images_count['bbox_count']>50]
0           1_1__H2_817171_IO-NIO198M_210316A0004-1-1.jpg
45      1__H2_817171_IO-NIONO_ID2021-05-24-11-23-29-1-...
50                     1_chimian_20210423_135111276_3.jpg
52                     1_chimian_20210423_135910318_3.jpg
54                     1_chimian_20210423_140036923_1.jpg
                              ...                        
1242                   4_chimian_20210421_155214825_2.jpg
1270                   4_chimian_20210423_115244729_1.jpg
1274                   4_chimian_20210423_134349569_1.jpg
1276                   4_chimian_20210423_134934808_2.jpg
1279                   4_chimian_20210423_135236926_3.jpg
Name: file_name, Length: 139, dtype: object
# Images with more than 100 objects
train_images_count['file_name'][train_images_count['bbox_count']>100]
142     2_H2_817171_IO-NIONO_ID2021-06-15-14-14-18-1-1...
161     2_H2_817171_IO-NIONO_ID2021-06-15-14-35-21-1-1...
171     2_H2_817171_IO-NIONO_ID2021-06-15-14-49-27-1-1...
182     2_H2_817171_IO-NIONO_ID2021-06-15-14-52-00-2-1...
189     2_H2_817171_IO-NIONO_ID2021-06-15-15-05-51-1-1...
560     3_H2_817171_IO-NIONO_ID2021-06-15-14-33-59-2-1...
568     3_H2_817171_IO-NIONO_ID2021-06-15-14-35-46-2-1...
574     3_H2_817171_IO-NIONO_ID2021-06-15-14-49-27-1-1...
579     3_H2_817171_IO-NIONO_ID2021-06-15-14-51-09-1-1...
597     3_H2_817171_IO-NIONO_ID2021-06-15-15-24-25-1-1...
805                    3_chimian_20210423_113634880_3.jpg
1074    4_H2_817171_IO-NIONO_ID2021-06-15-14-35-46-2-1...
1076    4_H2_817171_IO-NIONO_ID2021-06-15-14-36-12-2-1...
1085    4_H2_817171_IO-NIONO_ID2021-06-15-14-51-09-1-1...
1090    4_H2_817171_IO-NIONO_ID2021-06-15-15-04-09-1-1...
1161          4__H2_817171_IO-NIO198M_210329A0242-2-1.jpg
Name: file_name, dtype: object
less_spikes_ids = train_images_count[train_images_count['bbox_count'] > 50].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='More than 50 objects per image (examples)')


less_spikes_ids = train_images_count[train_images_count['bbox_count'] > 100].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='More than 100 objects per image (examples)')


less_spikes_ids = train_images_count[train_images_count['bbox_count'] < 5].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Fewer than 5 objects per image (examples)')


less_spikes_ids = train_images_count[train_images_count['area'] > max(train_images_count['area'])*0.8].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Largest total object area per image (examples)')


less_spikes_ids = train_images_count[train_images_count['area'] < min(train_images_count['area'])*1.2].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Smallest total object area per image (examples)')


df_train['bbox_count'] = df_train.apply(lambda row: 1 if any(row.bbox) else 0, axis=1)
train_images_count = df_train.groupby('file_name').max().reset_index()
less_spikes_ids = train_images_count[train_images_count['area'] > max(train_images_count['area'])*0.8].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Largest single object (examples)')


df_train['bbox_count'] = df_train.apply(lambda row: 1 if any(row.bbox) else 0, axis=1)
train_images_count = df_train.groupby('file_name').min().reset_index()
# Select images whose smallest object is close to the global minimum area
less_spikes_ids = train_images_count[train_images_count['area'] < min(train_images_count['area'])*5].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Smallest single object (examples)')


# IoU between two [x, y, w, h] boxes
def bb_intersection_over_union(boxA, boxB):
    boxA = [int(x) for x in boxA]
    boxB = [int(x) for x in boxB]
    boxA = [boxA[0], boxA[1], boxA[0]+boxA[2], boxA[1]+boxA[3]]
    boxB = [boxB[0], boxB[1], boxB[0]+boxB[2], boxB[1]+boxB[3]]
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])

    interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)

    boxAArea = (boxA[2] - boxA[0] + 1) * (boxA[3] - boxA[1] + 1)
    boxBArea = (boxB[2] - boxB[0] + 1) * (boxB[3] - boxB[1] + 1)
    
    iou = interArea / float(boxAArea + boxBArea - interArea)

    return iou
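As a sanity check, here is a compact, standalone restatement of the same IoU computation (the function name `iou_xywh` is ours, not from the competition code) applied to two illustrative cases:

```python
def iou_xywh(a, b):
    # Same convention as above: [x, y, w, h] -> corners, with the +1-pixel inclusive areas
    ax0, ay0, ax1, ay1 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx0, by0, bx1, by1 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    inter = max(0, min(ax1, bx1) - max(ax0, bx0) + 1) * max(0, min(ay1, by1) - max(ay0, by0) + 1)
    area_a = (ax1 - ax0 + 1) * (ay1 - ay0 + 1)
    area_b = (bx1 - bx0 + 1) * (by1 - by0 + 1)
    return inter / float(area_a + area_b - inter)

print(iou_xywh([0, 0, 10, 10], [0, 0, 10, 10]))  # identical boxes -> 1.0
print(iou_xywh([0, 0, 5, 5], [10, 10, 5, 5]))    # disjoint boxes  -> 0.0
```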
# tmp is a pandas Series of bboxes with a 0-based integer index
def bbox_iou(tmp):
    iou_agg = 0
    iou_cnt = 0
    for i in range(len(tmp)):
        for j in range(len(tmp)):
            if i != j:
                iou = bb_intersection_over_union(tmp[i], tmp[j])
                iou_agg += iou
                if iou > 0:
                    iou_cnt += 1
    # each pair was visited twice
    iou_agg = iou_agg/2
    iou_cnt = iou_cnt/2
    return iou_agg, iou_cnt
file_list = df_train['file_name'].unique()
train_iou_cal = pd.DataFrame(columns=('file_name', 'iou_agg', 'iou_cnt'))
for i in range(len(file_list)):
    tmp = df_train['bbox'][df_train.file_name==file_list[i]].reset_index(drop=True)
    iou_agg, iou_cnt = bbox_iou(tmp)
    train_iou_cal.loc[len(train_iou_cal)] = [file_list[i], iou_agg, iou_cnt]
# Overlap (IoU) summary statistics
train_iou_cal.iou_agg.describe()
count    1398.000000
mean        1.670045
std         4.108700
min         0.000000
25%         0.000000
50%         0.010461
75%         1.182195
max        34.947517
Name: iou_agg, dtype: float64
ps = np.zeros(len(train_iou_cal))
for i in range(len(train_iou_cal)):
    ps[i]=train_iou_cal['iou_agg'][i]
plt.title('Distribution of object-overlap severity (summed IoU per image)', fontproperties=myfont)
sns.distplot(ps, bins=21,kde=False)
<matplotlib.axes._subplots.AxesSubplot at 0x7f51fa6266d0>


train_iou_cal.iou_cnt.describe()
count    1398.000000
mean       24.456366
std        65.989039
min         0.000000
25%         0.000000
50%         1.000000
75%        16.000000
max       659.000000
Name: iou_cnt, dtype: float64
ps = np.zeros(len(train_iou_cal))
for i in range(len(train_iou_cal)):
    ps[i]=train_iou_cal['iou_cnt'][i]
plt.title('Distribution of overlapping-pair counts per image', fontproperties=myfont)
sns.distplot(ps, bins=21,kde=False)
<matplotlib.axes._subplots.AxesSubplot at 0x7f51faa99e10>


less_spikes_ids = train_iou_cal[train_iou_cal['iou_agg'] > max(train_iou_cal['iou_agg'])*0.9].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Highest overlap severity (examples)')


less_spikes_ids = train_iou_cal[train_iou_cal['iou_agg'] <= min(train_iou_cal['iou_agg'])*1.1].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Lowest overlap severity (examples)')


less_spikes_ids = train_iou_cal[train_iou_cal['iou_cnt'] > max(train_iou_cal['iou_cnt'])*0.9].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Most overlapping pairs (examples)')


less_spikes_ids = train_iou_cal[train_iou_cal['iou_cnt'] <= min(train_iou_cal['iou_cnt'])*1.1].file_name
plot_image_examples(df_train[df_train.file_name.isin(less_spikes_ids)], title='Fewest overlapping pairs (examples)')


Conclusions of the analysis: the gear-defect dataset has medium-to-large image resolutions and a substantial share of small objects; the class labels are roughly balanced, but positive and negative samples are imbalanced in some images.

Overall, this is a fairly typical small-object detection scenario.

4 Baseline Approach

The baseline uses PaddleDetection, PaddlePaddle's object-detection development kit, which covers the whole detection workflow end to end, from training to deployment.

We have already established that gear defect detection is essentially a small-object detection problem. So when choosing from the hundreds of models in PaddleDetection's model zoo, it makes sense to start with models optimized for small objects, such as SNIPER: Efficient Multi-Scale Training and the VisDrone-DET detection models.

4.1 Splitting the Dataset

The competition only provides a training set, so you must split the data yourself. The baseline uses a 9:1 split into training and validation sets.

For the split itself, the baseline uses the one-line dataset-split tool provided by PaddleX, PaddlePaddle's full-pipeline development tool.

Note: when pushing for a higher score, how you split the dataset is itself an important optimization lever.
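For readers who want to control the split themselves, the idea can be sketched by hand: shuffle the image ids, carve off 10% for validation, and emit two COCO dicts. A minimal sketch (the function, the tiny fake annotation dict, and the seed are all illustrative; the PaddleX command below does the equivalent in one line):

```python
import json
import random

def split_coco(anno, val_ratio=0.1, seed=42):
    """Split a loaded COCO dict into train/val dicts by image id."""
    img_ids = [img['id'] for img in anno['images']]
    random.Random(seed).shuffle(img_ids)
    n_val = int(len(img_ids) * val_ratio)
    val_ids = set(img_ids[:n_val])

    def subset(keep_val):
        # Keep an image (and its boxes) iff its membership in val_ids matches keep_val
        return {
            'images': [i for i in anno['images'] if (i['id'] in val_ids) == keep_val],
            'annotations': [a for a in anno['annotations'] if (a['image_id'] in val_ids) == keep_val],
            'categories': anno['categories'],
        }

    return subset(False), subset(True)

# Demonstrate on a tiny fake annotation dict:
fake = {
    'images': [{'id': i, 'file_name': '%d.jpg' % i} for i in range(10)],
    'annotations': [{'id': i, 'image_id': i, 'category_id': 1, 'bbox': [0, 0, 1, 1]} for i in range(10)],
    'categories': [{'id': 1, 'name': 'hp_cm'}],
}
train, val = split_coco(fake, val_ratio=0.1)
print(len(train['images']), len(val['images']))  # 9 1
```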

# Tidy up the dataset layout and remove dirty data
!mv data/齿轮检测数据集/train/train_coco.json data/齿轮检测数据集/
!rm data/齿轮检测数据集/train/Thumbs.db
# Install PaddleX
!pip install paddlex
# Organize the data directory
!mkdir MyDataset
!mkdir MyDataset/JPEGImages
!mv data/齿轮检测数据集/train/*.jpg MyDataset/JPEGImages/
!mv data/齿轮检测数据集/train_coco.json MyDataset/annotations.json
# Split the dataset (90% train / 10% val)
!paddlex --split_dataset --format COCO --dataset_dir MyDataset --val_value 0.1 --test_value 0.0
[08-06 22:42:32 MainThread @logger.py:242] Argv: /opt/conda/envs/python35-paddle120-env/bin/paddlex --split_dataset --format COCO --dataset_dir MyDataset --val_value 0.1 --test_value 0.0
2022-08-06 22:42:34 [INFO]	Dataset split starts...
loading annotations into memory...
Done (t=0.29s)
creating index...
index created!
2022-08-06 22:42:36 [INFO]	Dataset split done.
2022-08-06 22:42:36 [INFO]	Train samples: 1800
2022-08-06 22:42:36 [INFO]	Eval samples: 200
2022-08-06 22:42:36 [INFO]	Test samples: 0
2022-08-06 22:42:36 [INFO]	Split files saved in MyDataset

4.2 Model Training

Training a model directly with PaddleDetection can be very simple: just edit a few config files. See the PaddleDetection model parameter configuration tutorial for details.
The baseline uses the ppyolo_r50vd_dcn_1x_sniper_visdrone model, with the following config files:

  • configs/datasets/sniper_coco_detection.yml
    • dataset config file
  • configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml
    • SNIPER model config file
  • configs/runtime.yml
    • runtime config file
  • configs/ppyolo/_base_/ppyolo_r50vd_dcn.yml
    • model architecture config file
  • configs/ppyolo/_base_/optimizer_1x.yml
    • optimizer config file
  • configs/sniper/_base_/ppyolo_reader.yml
    • data reader config file

The baseline was trained for 100 epochs; see the corresponding files in this project for the exact configuration.

When working with the config files, and especially the dataset config, use absolute paths for the dataset wherever possible; this avoids a lot of errors.

!git clone https://gitee.com/paddlepaddle/PaddleDetection.git
Cloning into 'PaddleDetection'...
remote: Enumerating objects: 27523, done.[K
remote: Counting objects: 100% (7993/7993), done.[K
remote: Compressing objects: 100% (3200/3200), done.[K
remote: Total 27523 (delta 5978), reused 6496 (delta 4772), pack-reused 19530[K
Receiving objects: 100% (27523/27523), 283.98 MiB | 18.80 MiB/s, done.
Resolving deltas: 100% (20527/20527), done.
Checking connectivity... done.
%cd PaddleDetection
/home/aistudio/PaddleDetection
# Overwrite the configs with this project's versions
!cp ../sniper_visdrone_detection.yml configs/datasets/sniper_visdrone_detection.yml
!cp ../ppyolo_r50vd_dcn_1x_sniper_visdrone.yml configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml
!cp ../ppyolo_r50vd_dcn.yml configs/ppyolo/_base_/ppyolo_r50vd_dcn.yml 
!cp ../runtime.yml configs/runtime.yml
!cp ../optimizer_1x.yml configs/ppyolo/_base_/optimizer_1x.yml
!cp ../ppyolo_reader.yml configs/sniper/_base_/ppyolo_reader.yml

Optional: gather dataset statistics to obtain the image scaling factors, valid box-ratio ranges, chip size and stride, etc., and update the corresponding fields in configs/datasets/sniper_coco_detection.yml

!python tools/sniper_params_stats.py YOLOv3 /home/aistudio/MyDataset/annotations.json
[08/07 00:59:28] sniper_params_stats INFO: array_percentile(0): 0.005,array_percentile low(8): 0.02, array_percentile high(92): 0.209, array_percentile 100: 0.7493333333333333
[08/07 00:59:28] sniper_params_stats INFO: anchor_log_range:1.0665588534887818, box_ratio_log_range:1.0191162904470727
[08/07 00:59:28] sniper_params_stats INFO: box_cut_num:1, box_ratio_log_window:1.0191162904470727
[08/07 00:59:28] sniper_params_stats INFO: Box cut 0
[08/07 00:59:28] sniper_params_stats INFO: box_ratio_low: 0.020000000000000004
[08/07 00:59:28] sniper_params_stats INFO: image_target_size: 1599.9999999999998
[08/07 00:59:28] sniper_params_stats INFO: valid_ratio: [0.02     0.233125]
[08/07 00:59:28] sniper_params_stats INFO: -----------------------------------------------Result-----------------------------------------------
[08/07 00:59:28] sniper_params_stats INFO: image_target_sizes: [1599]
[08/07 00:59:28] sniper_params_stats INFO: valid_box_ratio_ranges: [[-1, -1]]
[08/07 00:59:28] sniper_params_stats INFO: chip_target_size: 800, chip_target_stride: 448

Optional: train the detector and generate negative-sample proposals

!python tools/train.py -c configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml --save_proposals --proposals_path=./proposals.json &>sniper.log 2>&1 &
# Start training
!python tools/train.py -c configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml --use_vdl=True --vdl_log_dir=./sniper/ --eval

Training takes a long time, so it is recommended to run it as a background task; see the background-task guide for the setup.


Then load the background task's output back into the project and continue with evaluation.

4.3 Model Evaluation

Training progress can be monitored by clicking the visualization button on the left (VisualDL).

The VisualDL curves from the baseline training run are not reproduced here.

!python tools/eval.py -c configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml -o weights=output/ppyolo_r50vd_dcn_1x_sniper_visdrone/best_model.pdparams
W0808 19:11:21.758419  2856 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0808 19:11:21.762652  2856 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[08/08 19:11:21] sniper_coco_dataset INFO: Init AnnoCropper...
[08/08 19:11:22] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyolo_r50vd_dcn_1x_sniper_visdrone/best_model.pdparams
[08/08 19:11:32] ppdet.engine INFO: Eval iter: 0
[08/08 19:13:46] ppdet.engine INFO: Eval iter: 100
[08/08 19:15:53] ppdet.engine INFO: Eval iter: 200
[08/08 19:17:59] ppdet.engine INFO: Eval iter: 300
[08/08 19:22:23] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[08/08 19:22:23] ppdet.metrics.coco_utils INFO: Start evaluate...
Loading and preparing results...
DONE (t=0.45s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=7.65s).
Accumulating evaluation results...
DONE (t=0.28s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.206
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.489
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.148
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.113
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.257
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.097
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.026
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.155
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.365
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.313
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.409
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.226
[08/08 19:22:31] ppdet.engine INFO: Total sample number: 2568, averge FPS: 6.086981044767421

4.4 Inference

# Pick a validation-set image to show the prediction quality
!python tools/infer.py -c configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml -o weights=output/ppyolo_r50vd_dcn_1x_sniper_visdrone/best_model --infer_img=../data/齿轮检测数据集/train/3_chimian_23671223_20210513_112814450_1.jpg
W0808 19:25:52.022071  4947 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0808 19:25:52.026350  4947 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
[08/08 19:25:52] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyolo_r50vd_dcn_1x_sniper_visdrone/best_model.pdparams
[08/08 19:25:52] sniper_coco_dataset INFO: Init AnnoCropper...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
100%|███████████████████████████████████████████| 12/12 [00:02<00:00,  3.34it/s]
[08/08 19:25:56] ppdet.engine INFO: Detection bbox results save in output/3_chimian_23671223_20210513_112814450_1.jpg


5 Generating the Submission

This section shows how to run batch prediction over the test set and directly produce a file in the required submission format.

# Prepare a directory for the test-set images and move the images to predict into it
# The test set is not bundled with this project; readers need to obtain it themselves
!unzip -O GBK ../data/data163113/齿轮检测A榜评测数据.zip -d ../data/
!mkdir ../data/test
!mv ../data/齿轮检测A榜评测数据/val/*.jpg ../data/test/
# Run batch prediction and generate bbox.json with the results
!python tools/infer.py -c configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml -o weights=output/ppyolo_r50vd_dcn_1x_sniper_visdrone/best_model --infer_dir=../data/test --save_results=True
images = set()
infer_dir = os.path.abspath('../data/test')
assert os.path.isdir(infer_dir), \
    "infer_dir {} is not a directory".format(infer_dir)
exts = ['jpg', 'jpeg', 'png', 'bmp']
exts += [ext.upper() for ext in exts]
for ext in exts:
    images.update(glob.glob('{}/*.{}'.format(infer_dir, ext)))
images = list(images)
with open('output/bbox.json', 'r') as f:
    results = json.load(f)
upload_json = []
for i in range(len(results)):
    dt = {}
    # image_id indexes into the image list built above (the same construction tools/infer.py uses)
    dt['name'] = os.path.basename(images[results[i]['image_id']])
    dt['category_id'] = results[i]['category_id']
    dt['bbox'] = results[i]['bbox']
    dt['score'] = results[i]['score']
    upload_json.append(dt)
# Write the submission file
with open('../upload.json','w') as f:
    json.dump(upload_json,f)

6 Room for Improvement

As a baseline for the gear defect detection task, this project still leaves plenty of headroom. Directions worth exploring include:

  • Try other algorithms from the PaddleDetection model zoo and fine-tune them
  • Use a more sophisticated dataset-split scheme
  • Train for more epochs and tune the learning rate
  • Ensemble several models
  • Tune hyperparameters such as the input resolution and keep iterating

Disclaimer

This project is a repost; see the link to the original project.
