Identifying Illegal Mobile-Phone Use in Industrial Production-Safety Environments

Competition Background:
Mobile phones have become a necessity in daily life and work. In industrial production environments, however, unauthorized phone use by workers has repeatedly caused production-safety accidents, and even casualties. For the sake of production safety and worker safety, more and more factories have therefore established phone-use management rules that restrict or ban phone use during production. There are currently two traditional management approaches: banning phones from the factory area altogether, and manual supervision and inspection. Both consume considerable manpower and cannot detect violations efficiently or accurately. Introducing AI to analyze the video frames captured by cameras installed in the production area makes it possible to identify phone-use violations quickly and accurately and issue warnings, effectively strengthening safety supervision, cutting costs, improving quality and efficiency, and accelerating digital transformation.

Competition Task:
Industrial scenes pose several difficulties: the targets are small, poorly distinguishable from surrounding objects, and set against cluttered backgrounds, which makes correct recognition challenging. The current industry-average accuracy sits around 80%, leaving room for further improvement. Using the provided training set, participating teams must determine whether the person in each image is using a phone and push the accuracy higher. Any method is allowed, including classification and object detection.

1. Problem Definition and Example

The images are crops of person regions; for each image, decide whether phone use is present.

Example images are shown below.

2. Difficulty Analysis

In a real factory environment, when machines or phone-like office items overlap with a person's hands, it is very hard for an AI algorithm to tell whether the object is a phone or something else; this is one of the main challenges of this task.

3. Data Description

  • Training images contain positive and negative examples. Folder 0 is the positive-example directory with 8,202 images of phone use; folder 1 is the negative-example directory with 10,600 images without phones.
  • The images all differ in size; for training they are resized to a uniform 256 * 256.
  • The test set contains 4,079 images, located in the test_images_a folder.
!cd 'data/data105762' && unzip -q train_img.zip
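After extraction, it is worth sanity-checking the per-folder file counts against the numbers quoted above. A minimal sketch, assuming the class sub-folders sit directly under `data/data105762/train_img` as in this notebook's paths (the helper name `count_images` is ours):

```python
import os

def count_images(root, exts=('.jpg', '.jpeg', '.png')):
    """Count image files inside each class sub-folder (e.g. '0' and '1')."""
    counts = {}
    for sub in sorted(os.listdir(root)):
        sub_dir = os.path.join(root, sub)
        if os.path.isdir(sub_dir):
            counts[sub] = sum(1 for f in os.listdir(sub_dir)
                              if f.lower().endswith(exts))
    return counts

# counts = count_images('data/data105762/train_img')
# per the data description, folder '0' should hold 8202 files and '1' 10600
```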

Data EDA

Exploratory Data Analysis (EDA) means analyzing and exploring raw data through plots, tables, curve fitting, summary statistics, and similar techniques to uncover its structure and regularities. When we first encounter a dataset we often have no idea where to start, and EDA is an effective way in. In this project the training set contains positive and negative examples: directory 0 holds the 8,202 positive images and directory 1 holds the 10,600 negative images.

import pandas as pd

df = pd.read_csv('train.csv')
# plot the class distribution and save the figure
fig = df['category_id'].hist().get_figure()
fig.savefig('eda.jpg')

Data Loading, Model Training, and Saving

# Import the required libraries
from sklearn.utils import shuffle
import os
import pandas as pd
import numpy as np
from PIL import Image

import paddle
import paddle.nn as nn
from paddle.io import Dataset
import paddle.vision.transforms as T
import paddle.nn.functional as F
from paddle.metric import Accuracy

from paddle.vision.models import resnet18
from sklearn.metrics import f1_score
from res2net import Res2Net50_vd_26w_4s

import warnings
warnings.filterwarnings("ignore")

# Load the training labels
train_images = pd.read_csv('/home/aistudio/train.csv')

# Label shuffling: oversample so every class has the same number of rows

def labelShuffling(dataFrame, groupByName = 'category_id'):

    groupDataFrame = dataFrame.groupby(by=[groupByName])
    labels = groupDataFrame.size()
    print("length of label is ", len(labels))
    maxNum = max(labels)
    lst = pd.DataFrame()
    for i in range(len(labels)):
        print("Processing label  :", i)
        tmpGroupBy = groupDataFrame.get_group(i)
        createdShuffleLabels = np.random.permutation(np.array(range(maxNum))) % labels[i]
        print("Num of the label is : ", labels[i])
        lst = pd.concat([lst, tmpGroupBy.iloc[createdShuffleLabels]], ignore_index=True)
        print("Done")
    # lst.to_csv('test1.csv', index=False)
    return lst
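To see what labelShuffling is doing, here is the same oversampling idea on a hand-made toy frame (self-contained, and using pd.concat since DataFrame.append was removed in newer pandas):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'image_id':    ['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg'],
    'category_id': [0, 0, 0, 1, 1],
})

groups = toy.groupby('category_id')
max_num = groups.size().max()   # the largest class has 3 rows

balanced = []
for _, g in groups:
    # shuffled indices 0..max_num-1, wrapped around the class size,
    # so every class is resampled up to max_num rows
    idx = np.random.permutation(max_num) % len(g)
    balanced.append(g.iloc[idx])
balanced = pd.concat(balanced, ignore_index=True)
# both classes now have 3 rows each, 6 rows total
```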

# Split into training and validation sets (85% / 15%)
all_size = len(train_images)
# print(all_size)
train_size = int(all_size * 0.85)
train_image_list = train_images[:train_size]
val_image_list = train_images[train_size:]

df = labelShuffling(train_image_list)
df = shuffle(df)

train_image_path_list = df['image_id'].values
label_list = df['category_id'].values
label_list = paddle.to_tensor(label_list, dtype='int64')
train_label_list = label_list
# train_label_list = paddle.nn.functional.one_hot(label_list, num_classes=2)

val_image_path_list = val_image_list['image_id'].values
val_label_list = val_image_list['category_id'].values
val_label_list = paddle.to_tensor(val_label_list, dtype='int64')
# val_label_list = paddle.nn.functional.one_hot(val_label_list, num_classes=2)

# Define the data augmentation / preprocessing pipeline
data_transforms = T.Compose([
    T.Resize(size=(256, 256)),
    T.RandomHorizontalFlip(0.5),   # argument is the flip probability, not an image size
    T.RandomVerticalFlip(0.5),
    T.RandomRotation(30),
    T.Transpose(),    # HWC -> CHW
    T.Normalize(
        mean=[0, 0, 0],        # with std=255 this rescales pixels to [0, 1]
        std=[255, 255, 255],
        to_rgb=True)    
])
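Note that Normalize with mean 0 and std 255 simply rescales pixels to [0, 1], which matches the plain img / 255 used in the inference code later in this notebook. In NumPy terms:

```python
import numpy as np

# dummy 256x256 RGB image in HWC uint8 layout, as produced by Resize
img_hwc = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)

# T.Transpose(): HWC -> CHW
img_chw = img_hwc.transpose(2, 0, 1)

# T.Normalize(mean=[0,0,0], std=[255,255,255]): (x - 0) / 255
img_norm = (img_chw.astype('float32') - 0.0) / 255.0

assert img_norm.shape == (3, 256, 256)
assert img_norm.min() >= 0.0 and img_norm.max() <= 1.0
```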

# Build the Dataset
class MyDataset(paddle.io.Dataset):
    """
    步骤一:继承paddle.io.Dataset类
    """
    def __init__(self, train_img_list, val_img_list,train_label_list,val_label_list, mode='train'):
        """
        步骤二:实现构造函数,定义数据读取方式,划分训练和测试数据集
        """
        super(MyDataset, self).__init__()
        self.img = []
        self.label = []
        # 借助pandas读csv的库
        self.train_images = train_img_list
        self.test_images = val_img_list
        self.train_label = train_label_list
        self.test_label = val_label_list
        self.bbox_dict = self.load_bbox()

        if mode == 'train':
            # 读train_images的数据
            for img,la in zip(self.train_images, self.train_label):
                self.img.append('data/data105762/train_img/'+img)
                self.label.append(la)
        else:
            # 读test_images的数据
            for img,la in zip(self.train_images, self.train_label):
                self.img.append('data/data105762/train_img/'+img)
                self.label.append(la)

    def load_bbox(self):
        # Image,x0,y0,x1,y1
        print('loading bbox...')
        bbox = pd.read_csv('bboxs.csv')
        Images = bbox['Image'].tolist()
        x0s = bbox['x0'].tolist()
        y0s = bbox['y0'].tolist()
        x1s = bbox['x1'].tolist()
        y1s = bbox['y1'].tolist()
        bbox_dict = {}
        for Image,x0,y0,x1,y1 in zip(Images,x0s,y0s,x1s,y1s):
            bbox_dict[Image] = [x0, y0, x1, y1]
        return bbox_dict

    def load_img(self, image_path):
        # read the image with Pillow
        image = Image.open(image_path).convert('RGB')
        # Optionally crop to the annotated bbox, e.g.:
        # name = image_path.split('/')[-1]
        # if name in self.bbox_dict:
        #     x0, y0, x1, y1 = self.bbox_dict[name]
        #     image = image.crop((int(x0), int(y0), int(x1), int(y1)))
        return image

    def __getitem__(self, index):
        """
        步骤三:实现__getitem__方法,定义指定index时如何获取数据,并返回单条数据(训练数据,对应的标签)
        """
        image = self.load_img(self.img[index])

        label = self.label[index]
        # label = paddle.to_tensor(label)
        
        return data_transforms(image), label

    def __len__(self):
        """
        步骤四:实现__len__方法,返回数据集总数目
        """
        return len(self.img)
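If you enable the bbox crop hinted at in load_img, note that a PIL image is cropped with .crop((x0, y0, x1, y1)), not NumPy-style slicing. A minimal sketch with bounds clamping (the helper name crop_to_bbox is ours, not part of the baseline):

```python
from PIL import Image

def crop_to_bbox(image, bbox):
    """Crop a PIL image to [x0, y0, x1, y1], clamping to the image bounds."""
    x0, y0, x1, y1 = [int(v) for v in bbox]
    w, h = image.size
    box = (max(0, x0), max(0, y0), min(w, x1), min(h, y1))
    return image.crop(box)

demo = Image.new('RGB', (100, 80))
assert crop_to_bbox(demo, [10, 5, 60, 45]).size == (50, 40)
```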

#train_loader
train_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='train')
train_loader = paddle.io.DataLoader(train_dataset, places=paddle.CPUPlace(), batch_size=128, shuffle=True, num_workers=0)

#val_loader
val_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='test')
val_loader = paddle.io.DataLoader(val_dataset, places=paddle.CPUPlace(), batch_size=128, shuffle=False, num_workers=0)

# Wrap the network with the high-level Model API
model_res = Res2Net50_vd_26w_4s(class_dim=2)
model = paddle.Model(model_res)

# Define the optimizer

# scheduler = paddle.optimizer.lr.LinearWarmup(
#         learning_rate=0.5, warmup_steps=20, start_lr=0, end_lr=0.5, verbose=True)
# optim = paddle.optimizer.SGD(learning_rate=scheduler, parameters=model.parameters())
optim = paddle.optimizer.Adam(learning_rate=3e-4, parameters=model.parameters())

# Configure the model
model.prepare(
    optim,
    paddle.nn.CrossEntropyLoss(),
    Accuracy()
    )

# load the pretrained weights
model.load('Res2Net50_vd_26w_4s_pretrained.pdparams', skip_mismatch=True)

# Train and evaluate the model
model.fit(train_loader,
        val_loader,
        log_freq=1,
        epochs=15,
        # callbacks=Callbk(write=write, iters=iters),
        verbose=1,
        )

# result = model.predict(val_dataset, batch_size=64)
# print(len(result[0]), result[0][0].shape)

# Save the model parameters
# model.save('Hapi_MyCNN')  # save for resuming training
model.save('Hapi_MyCNN1', False)  # save for inference

# model_file_path="Hapi_MyCNN1"

# model = paddle.jit.load(model_file_path)

# paddle.onnx.export(model, 'test')



Check the F1-Score

import os, time
import matplotlib.pyplot as plt
import paddle
from PIL import Image
import numpy as np
import pandas as pd
from sklearn.metrics import f1_score

use_gpu = True
model_file_path="Hapi_MyCNN1"
paddle.set_device('gpu:0') if use_gpu else paddle.set_device('cpu')
model = paddle.jit.load(model_file_path)

model.eval()  # switch to evaluation mode

def load_image(img_path):
    '''
    Preprocess one image for prediction
    '''
    img = Image.open(img_path).convert('RGB')

    # resize with bilinear interpolation
    img = img.resize((256, 256), Image.BILINEAR)
    img = np.array(img).astype('float32')

    # HWC to CHW
    img = img.transpose((2, 0, 1))

    # normalize pixel values to [0, 1]
    img = img / 255
    # Alternatively, standardize with dataset statistics:
    # mean = [0.44258407, 0.4834136, 0.2998949]
    # std = [0.2839005, 0.28273728, 0.27038324]
    # img[0] = (img[0] - mean[0]) / std[0]
    # img[1] = (img[1] - mean[1]) / std[1]
    # img[2] = (img[2] - mean[2]) / std[2]

    return img

def infer_img(path, model):
    '''
    Run inference on one image
    '''
    # preprocess the image
    infer_imgs = []
    infer_imgs.append(load_image(path))
    infer_imgs = np.array(infer_imgs)
    label_pre = []
    for i in range(len(infer_imgs)):
        data = infer_imgs[i]
        dy_x_data = np.array(data).astype('float32')
        dy_x_data = dy_x_data[np.newaxis, :, :, :]
        img = paddle.to_tensor(dy_x_data)
        out = model(img)
        # print(paddle.nn.functional.softmax(out)[0])  # not needed if the model already ends in softmax

        lab = np.argmax(out.numpy())  # index of the largest logit
        label_pre.append(int(lab))

    return label_pre

# img_list = os.listdir('data/data105762/test_images_a')

img_list = val_image_path_list
pre_list = []

for i in range(len(img_list)):
    pre_list.append(infer_img(path='data/data105762/train_img/' + img_list[i], model=model)[0])

img = pd.DataFrame(img_list)
img = img.rename(columns = {0:"image_id"})
img['category_id'] = pre_list

def score(y_true, y_pred):
    # competition metric: weighted F1 over the two classes
    return 0.4 * f1_score(y_true, y_pred, pos_label=1) + 0.6 * f1_score(y_true, y_pred, pos_label=0)

base_score = score(val_label_list.numpy(), img['category_id'].values)
print(base_score)
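As a quick sanity check of the 0.4/0.6 weighting, here is the metric computed on hand-made labels:

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

f1_pos = f1_score(y_true, y_pred, pos_label=1)   # F1 treating class 1 as positive
f1_neg = f1_score(y_true, y_pred, pos_label=0)   # F1 treating class 0 as positive
weighted = 0.4 * f1_pos + 0.6 * f1_neg
# here both per-class F1 scores equal 2/3, so the weighted score is also 2/3
```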

Generate the Submission File

import os, time
import matplotlib.pyplot as plt
import paddle
from PIL import Image
import numpy as np
import pandas as pd

use_gpu = True
model_file_path="Hapi_MyCNN1"
paddle.set_device('gpu:0') if use_gpu else paddle.set_device('cpu')
model = paddle.jit.load(model_file_path)

model.eval()  # switch to evaluation mode

def load_image(img_path):
    '''
    Preprocess one image for prediction
    '''
    img = Image.open(img_path).convert('RGB')

    # resize with bilinear interpolation
    img = img.resize((256, 256), Image.BILINEAR)
    img = np.array(img).astype('float32')

    # HWC to CHW
    img = img.transpose((2, 0, 1))

    # normalize pixel values to [0, 1]
    img = img / 255

    return img

def infer_img(path, model):
    '''
    Run inference on one image
    '''
    # preprocess the image
    infer_imgs = []
    infer_imgs.append(load_image(path))
    infer_imgs = np.array(infer_imgs)
    label_pre = []
    for i in range(len(infer_imgs)):
        data = infer_imgs[i]
        dy_x_data = np.array(data).astype('float32')
        dy_x_data = dy_x_data[np.newaxis, :, :, :]
        img = paddle.to_tensor(dy_x_data)
        out = model(img)
        # print(paddle.nn.functional.softmax(out)[0])  # not needed if the model already ends in softmax

        lab = np.argmax(out.numpy())  # index of the largest logit
        label_pre.append(int(lab))

    return label_pre

img_list = os.listdir('data/data105762/train_img/test_images_a')

# img_list = val_image_path_list
pre_list = []

for i in range(len(img_list)):
    pre_list.append(infer_img(path='data/data105762/train_img/test_images_a/' + img_list[i], model=model)[0])

img = pd.DataFrame(img_list)
img = img.rename(columns = {0:"image_name"})
img['class_id'] = pre_list


img.to_csv('test.csv', index=False)
  • Preliminary round result

Summary

This baseline approaches the task purely as binary image classification. The proper final solution should combine object detection with classification.

The baseline only uses the 0/1 label information, but the object-annotation information provided by the organizers could be used to refine the inputs further: the full images are large, and cropping them down yields smaller regions with finer, more informative content, and smaller images also train faster.

This mirrors the first-place solution of the earlier Kaggle whale-tail identification challenge; following that approach, one would extract the annotated phone regions directly for training.

Another option is to treat the task as single-class object detection and also feed images without annotations into the network during training, as in the YOLOv4 approach. Both directions are worth exploring.

The competition has ended, but the top-ranking solutions have not been published; if you come across them, please share.
