2021 Qinghai Province First "Hehuang Cup" Data Lake Algorithm Competition — Vehicle Multi-Attribute Recognition Track, Preliminary Round Baseline (Unofficial)

Reposted from AI Studio. Project link: https://aistudio.baidu.com/aistudio/projectdetail/3511579

1. Competition Background

China's digital economy is developing rapidly and has produced impressive results in many fields, and society increasingly relies on it to drive development. At the same time, advances in artificial intelligence have shown that computers can outperform humans at certain tasks. The core challenge of the digital economy remains connecting human society with data and coordinating the development of technology and society. This competition is held to help improve the quality of social development and to solve practical problems that arise along the way.
The competition upholds the principles of fairness, openness, and innovation. It aims to build a resource-sharing platform, discover and cultivate big-data talent nationwide with a focus on Qinghai, improve data-science thinking, practical skills, and collaboration across society, encourage exploration, strengthen innovative applications of big-data technology in the industrialization of Qinghai's digital economy, and further promote the integration of industry, academia, research, and application of big data in Qinghai.

2. Competition Task

The growing number of vehicles has made traffic problems increasingly serious, giving rise to intelligent transportation systems; combined with advances in artificial intelligence, vehicle recognition and management has become a key part of such systems. This "Vehicle Multi-Attribute Recognition" competition targets that problem and aims to provide a basis for traffic, public-security, and other departments to handle road-traffic incidents.
Participants are expected to train models on a large number of vehicle images captured by surveillance cameras, solving three problems: vehicle type recognition, driving-direction recognition, and body-color recognition.

The preliminary round uses A/B leaderboards. A training set is provided for building models and a test set for submitting predictions to the leaderboard. In the preliminary round, participants must recognize the vehicle type in each image and submit results in the required format.

3. Evaluation Rules

Data example

The preliminary training data consists of two parts: images (.jpg files) and labels (a .csv file). The label file has two columns, "id" and "type": "id" is a string holding the image file name, and "type" is a string holding the vehicle type in the image. In the preliminary round there are 4 types: car, suv, van, and truck.
Sample:
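The original notebook showed a sample of the label file here. For illustration only, a few made-up rows in the documented format might look like the following (the file names and labels are invented, not taken from the real data):

import pandas as pd

# Illustrative sample only: the real label file ships with the competition data.
sample = pd.DataFrame({
    "id":   ["000001.jpg", "000002.jpg", "000003.jpg", "000004.jpg"],
    "type": ["car", "suv", "van", "truck"],
})
print(sample)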

!unzip -oq /home/aistudio/work/testA.zip
!unzip -oq /home/aistudio/work/train.zip

Data EDA

Exploratory Data Analysis (EDA) means exploring the raw data through plots, tables, curve fitting, summary statistics, and similar techniques in order to uncover its structure and patterns. When we first receive a dataset we often have no idea where to start, and EDA is an effective way in.

For an image classification task, the first step is usually to count the number of samples per class and inspect the distribution of the training set. Analyzing the distribution helps shape the solution (understanding the data is essential). A minimal counting sketch is shown below.
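As a minimal sketch (assuming the preprocessed label file train_sorted1.csv with a 'label' column, the same file used later in this baseline), the per-class counts can be checked like this:

import pandas as pd

# Count samples per class and look at the class proportions
train_images = pd.read_csv('train_sorted1.csv')
counts = train_images['label'].value_counts()
print(counts)                    # absolute counts per class
print(counts / counts.sum())     # relative frequencies reveal the imbalance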

Building the baseline

Data preprocessing

The EDA above shows that the classes are imbalanced, so the minority classes are oversampled up to the size of the largest class via label shuffling (the labelShuffling function below).

# Import the required libraries
from sklearn.utils import shuffle
import os
import pandas as pd
import numpy as np
from PIL import Image

import paddle
import paddle.nn as nn
from paddle.io import Dataset
import paddle.vision.transforms as T
import paddle.nn.functional as F
from paddle.metric import Accuracy

import warnings
warnings.filterwarnings("ignore")

# Read the label data
train_images = pd.read_csv('train_sorted1.csv')

train_images = shuffle(train_images)

from sklearn.preprocessing import LabelEncoder

# Encode the string labels (car / suv / van / truck) into integers 0-3
encoder = LabelEncoder()
housing_cat = train_images["label"]
housing_cat_encoded = encoder.fit_transform(housing_cat)
train_images["label"] = housing_cat_encoded  # assign the array positionally to avoid index misalignment after shuffle

# print(train_images)
# label shuffling: oversample every class up to the size of the largest class

def labelShuffling(dataFrame, groupByName = 'label'):

    groupDataFrame = dataFrame.groupby(by=[groupByName])
    labels = groupDataFrame.size()
    print("length of label is ", len(labels))
    maxNum = max(labels)
    lst = pd.DataFrame()
    for i in range(len(labels)):
        print("Processing label  :", i)
        tmpGroupBy = groupDataFrame.get_group(i)
        createdShuffleLabels = np.random.permutation(np.array(range(maxNum))) % labels[i]
        print("Num of the label is : ", labels[i])
        lst = pd.concat([lst, tmpGroupBy.iloc[createdShuffleLabels]], ignore_index=True)
        print("Done")
    # lst.to_csv('test1.csv', index=False)
    return lst

# Split into training and validation sets
all_size = len(train_images)
# print(all_size)
train_size = int(all_size * 0.8)
train_image_list = train_images[:train_size]
val_image_list = train_images[train_size:]

df = train_image_list

# print(df)
df = labelShuffling(train_image_list)
df = shuffle(df)


train_image_path_list = df['image'].values
label_list = df['label'].values
label_list = paddle.to_tensor(label_list, dtype='int64')
train_label_list = paddle.nn.functional.one_hot(label_list, num_classes=4)

val_image_path_list = val_image_list['image'].values
val_label_list = val_image_list['label'].values
val_label_list = paddle.to_tensor(val_label_list, dtype='int64')
val_label_list = paddle.nn.functional.one_hot(val_label_list, num_classes=4)

# Define the data preprocessing / augmentation pipeline
data_transforms = T.Compose([
    T.Resize(size=(256, 256)),
    T.RandomHorizontalFlip(0.5),
    T.RandomVerticalFlip(0.5),
    T.RandomRotation(30),
    T.Transpose(),    # HWC -> CHW
    T.Normalize(
        mean=[0, 0, 0],        # scale pixel values to [0, 1]
        std=[255, 255, 255],
        to_rgb=True)
])
length of label is  4
Processing label  : 0
Num of the label is :  1448
Done
Processing label  : 1
Num of the label is :  587
Done
Processing label  : 2
Num of the label is :  140
Done
Processing label  : 3
Num of the label is :  149
Done


W0224 22:03:39.271940  5640 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0224 22:03:39.275599  5640 device_context.cc:465] device: 0, cuDNN Version: 7.6.
# Build the Dataset
class MyDataset(paddle.io.Dataset):
    """
    步骤一:继承paddle.io.Dataset类
    """
    def __init__(self, train_img_list, val_img_list,train_label_list,val_label_list, mode='train'):
        """
        步骤二:实现构造函数,定义数据读取方式,划分训练和测试数据集
        """
        super(MyDataset, self).__init__()
        self.img = []
        self.label = []
        # image paths and labels prepared earlier (read from the csv with pandas)
        self.train_images = train_img_list
        self.test_images = val_img_list
        self.train_label = train_label_list
        self.test_label = val_label_list
        if mode == 'train':
            # collect the training images
            for img,la in zip(self.train_images, self.train_label):
                self.img.append('train/train/'+img)
                self.label.append(la)
        else:
            # collect the validation images
            for img,la in zip(self.test_images, self.test_label):
                self.img.append('train/train/'+img)
                self.label.append(la)

    def load_img(self, image_path):
        # Read the image with Pillow and convert it to RGB
        image = Image.open(image_path).convert('RGB')
        return image

    def __getitem__(self, index):
        """
        步骤三:实现__getitem__方法,定义指定index时如何获取数据,并返回单条数据(训练数据,对应的标签)
        """
        image = self.load_img(self.img[index])
        label = self.label[index]
        # label = paddle.to_tensor(label)
        
        return data_transforms(image), paddle.nn.functional.label_smooth(label)

    def __len__(self):
        """
        步骤四:实现__len__方法,返回数据集总数目
        """
        return len(self.img)
#train_loader
train_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='train')
train_loader = paddle.io.DataLoader(train_dataset, places=paddle.CPUPlace(), batch_size=32, shuffle=True, num_workers=0)

#val_loader
val_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='test')
val_loader = paddle.io.DataLoader(val_dataset, places=paddle.CPUPlace(), batch_size=32, shuffle=True, num_workers=0)

Model training

from res2net import Res2Net50_vd_26w_4s

# Wrap the network with the high-level Model API
model_res = Res2Net50_vd_26w_4s(class_dim=4)
model = paddle.Model(model_res)

# Define the optimizer

# scheduler = paddle.optimizer.lr.LinearWarmup(
#         learning_rate=0.5, warmup_steps=20, start_lr=0, end_lr=0.5, verbose=True)
# optim = paddle.optimizer.SGD(learning_rate=scheduler, parameters=model.parameters())
optim = paddle.optimizer.Adam(learning_rate=3e-4, parameters=model.parameters())

# Configure the model
model.prepare(
    optim,
    paddle.nn.CrossEntropyLoss(soft_label=True),
    Accuracy()
    )

model.load('Res2Net50_vd_26w_4s_pretrained.pdparams',skip_mismatch=True)

# Train and evaluate the model
model.fit(train_loader,
        val_loader,
        log_freq=1,
        epochs=15,
        verbose=1,
        )
The loss value printed in the log is the current step, and the metric is the average value of previous steps.
Epoch 1/15
step 181/181 [==============================] - loss: 1.3492 - acc: 0.3267 - 231ms/step         
Eval begin...
step 9/9 [==============================] - loss: 1.3039 - acc: 0.2548 - 154ms/step        
Eval samples: 259
Epoch 2/15
step 181/181 [==============================] - loss: 1.1146 - acc: 0.4498 - 228ms/step        
Eval begin...
step 9/9 [==============================] - loss: 1.5971 - acc: 0.3205 - 153ms/step        
Eval samples: 259
Epoch 3/15
step 181/181 [==============================] - loss: 0.8872 - acc: 0.5568 - 229ms/step        
Eval begin...
step 9/9 [==============================] - loss: 1.7063 - acc: 0.3591 - 146ms/step        
Eval samples: 259
Epoch 4/15
step 181/181 [==============================] - loss: 0.9437 - acc: 0.6499 - 229ms/step        
Eval begin...
step 9/9 [==============================] - loss: 1.4568 - acc: 0.2394 - 144ms/step        
Eval samples: 259
Epoch 5/15
step 181/181 [==============================] - loss: 0.8166 - acc: 0.7210 - 227ms/step         
Eval begin...
step 9/9 [==============================] - loss: 0.8642 - acc: 0.4324 - 148ms/step        
Eval samples: 259
Epoch 6/15
step 181/181 [==============================] - loss: 0.7758 - acc: 0.7629 - 227ms/step        
Eval begin...
step 9/9 [==============================] - loss: 0.8764 - acc: 0.4981 - 148ms/step        
Eval samples: 259
Epoch 7/15
step 181/181 [==============================] - loss: 0.6554 - acc: 0.7987 - 226ms/step         
Eval begin...
step 9/9 [==============================] - loss: 1.0617 - acc: 0.3050 - 146ms/step        
Eval samples: 259
Epoch 8/15
step 181/181 [==============================] - loss: 0.6482 - acc: 0.8358 - 230ms/step        
Eval begin...
step 9/9 [==============================] - loss: 2.0823 - acc: 0.3745 - 149ms/step        
Eval samples: 259
Epoch 9/15
step 181/181 [==============================] - loss: 0.6632 - acc: 0.8524 - 231ms/step        
Eval begin...
step 9/9 [==============================] - loss: 1.5684 - acc: 0.4363 - 146ms/step        
Eval samples: 259
Epoch 10/15
step 181/181 [==============================] - loss: 0.6055 - acc: 0.8726 - 228ms/step        
Eval begin...
step 9/9 [==============================] - loss: 1.1803 - acc: 0.4479 - 149ms/step        
Eval samples: 259
Epoch 11/15
step 181/181 [==============================] - loss: 0.5604 - acc: 0.8955 - 228ms/step        
Eval begin...
step 9/9 [==============================] - loss: 2.5635 - acc: 0.4942 - 147ms/step        
Eval samples: 259
Epoch 12/15
step 181/181 [==============================] - loss: 0.5882 - acc: 0.9054 - 227ms/step        
Eval begin...
step 9/9 [==============================] - loss: 1.1620 - acc: 0.5135 - 146ms/step        
Eval samples: 259
Epoch 13/15
step 181/181 [==============================] - loss: 0.5160 - acc: 0.9187 - 226ms/step         
Eval begin...
step 9/9 [==============================] - loss: 2.2730 - acc: 0.5560 - 146ms/step        
Eval samples: 259
Epoch 14/15
step 181/181 [==============================] - loss: 0.4833 - acc: 0.9330 - 228ms/step        
Eval begin...
step 9/9 [==============================] - loss: 0.7435 - acc: 0.4942 - 144ms/step        
Eval samples: 259
Epoch 15/15
step 181/181 [==============================] - loss: 0.4185 - acc: 0.9408 - 227ms/step         
Eval begin...
step 9/9 [==============================] - loss: 1.3306 - acc: 0.5521 - 149ms/step        
Eval samples: 259
# Save the model parameters
# model.save('Hapi_MyCNN')  # save for training
model.save('Hapi_MyCNN1', False)  # save for inference

Checking the F1-Score

Here the model's performance on the validation set is checked according to the evaluation rules.
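For reference, the check below reports the macro-averaged F1 score (f1_score(..., average='macro')), i.e. the unweighted mean of the per-class F1 scores: F1_macro = (1/C) * sum_c 2 * P_c * R_c / (P_c + R_c), where P_c and R_c are the precision and recall of class c and C = 4 here. Because every class counts equally, weak performance on the under-represented classes pulls the score down sharply.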

!pip install patta
import os, time
import matplotlib.pyplot as plt
import paddle
from PIL import Image
import numpy as np
import pandas as pd
from sklearn.metrics import f1_score,classification_report
import patta as tta

use_gpu = True
model_file_path="Hapi_MyCNN1"
paddle.set_device('gpu:0') if use_gpu else paddle.set_device('cpu')
model = paddle.jit.load(model_file_path)
model = tta.ClassificationTTAWrapper(model, tta.aliases.ten_crop_transform(224,224))

model.eval()  # evaluation mode

def load_image(img_path):
    '''
    Preprocess an image for prediction
    '''
    img = Image.open(img_path).convert('RGB')

    # resize
    img = img.resize((256, 256), Image.BILINEAR)  # bilinear interpolation
    img = np.array(img).astype('float32')

    # HWC to CHW
    img = img.transpose((2, 0, 1))

    # normalize pixel values to [0, 1]
    img = img / 255

    return img

def infer_img(path, model):
    '''
    Run prediction on a single image
    '''
    # preprocess the image
    infer_imgs = []
    infer_imgs.append(load_image(path))
    infer_imgs = np.array(infer_imgs)
    label_pre = []
    for i in range(len(infer_imgs)):
        data = infer_imgs[i]
        dy_x_data = np.array(data).astype('float32')
        dy_x_data = dy_x_data[np.newaxis,:, : ,:]
        img = paddle.to_tensor(dy_x_data)
        out = model(img)
       
        lab = np.argmax(out.numpy())  # argmax(): index of the highest score
        label_pre.append(int(lab))
       
    return label_pre


img_list = val_image_path_list
pre_list = []

for i in range(len(img_list)):
    pre_list.append(infer_img(path='train/train/' + img_list[i], model=model)[0])

img = pd.DataFrame(img_list)
img = img.rename(columns = {0:"image_id"})
img['category_id'] = pre_list

base_score = f1_score(val_image_list['label'].values, img['category_id'].values, average='macro')
clc_report = classification_report(val_image_list['label'].values, img['category_id'].values)
print(base_score)
print(clc_report)
0.20938058534405718
              precision    recall  f1-score   support

           0       0.61      0.80      0.69       162
           1       0.21      0.11      0.15        62
           2       0.00      0.00      0.00        24
           3       0.00      0.00      0.00        11

    accuracy                           0.53       259
   macro avg       0.20      0.23      0.21       259
weighted avg       0.43      0.53      0.47       259

Model prediction

import os, time
import matplotlib.pyplot as plt
import paddle
from PIL import Image
import numpy as np
import patta as tta
import pandas as pd

use_gpu = True
# model_file_path="Hapi_MyCNN"
paddle.set_device('gpu:0') if use_gpu else paddle.set_device('cpu')
model = paddle.jit.load('/home/aistudio/Hapi_MyCNN1')

model = tta.ClassificationTTAWrapper(model, tta.aliases.ten_crop_transform(224,224))
model.eval()  # evaluation mode

def load_image(img_path):
    '''
    Preprocess an image for prediction
    '''
    img = Image.open(img_path).convert('RGB')

    # resize
    img = img.resize((256, 256), Image.BILINEAR)  # bilinear interpolation
    img = np.array(img).astype('float32')

    # HWC to CHW
    img = img.transpose((2, 0, 1))

    # normalize pixel values to [0, 1]
    img = img / 255

    return img

def infer_img(path, model):
    '''
    Run prediction on a single image
    '''
    # preprocess the image

    label_pre = []
    labeled_img = []
    data = load_image(path)

    dy_x_data = np.array(data).astype('float32')
    dy_x_data = dy_x_data[np.newaxis,:, : ,:]
    img = paddle.to_tensor(dy_x_data)
    out = model(img)
 
    res = paddle.nn.functional.softmax(out)[0]  # not needed if the model already ends with a softmax
    lab = np.argmax(out.numpy())  # argmax(): index of the highest score

    if res[lab].numpy()[0] >= 0.95:  # keep only high-confidence predictions
        label_pre.append(int(lab))
        labeled_img.append(path)
    return label_pre

img_list = os.listdir('testA/')
img_list.sort()
img_list.sort(key=lambda x: int((x[:-4]).split('_')[-1]))  # sort file names numerically

pre_list = []
labeled_img_list = []
for i in range(len(img_list)):
    data = load_image(img_path='testA/' + img_list[i])
   
    dy_x_data = np.array(data).astype('float32')
    dy_x_data = dy_x_data[np.newaxis,:, : ,:]
    img = paddle.to_tensor(dy_x_data)
    out = model(img)
    res = paddle.nn.functional.softmax(out)[0]  # not needed if the model already ends with a softmax
    lab = np.argmax(out.numpy())  # argmax(): index of the highest score
    pre_list.append(int(lab))
    labeled_img_list.append(img_list[i])
    
# Map the integer predictions back to the original type strings (car / suv / van / truck)
img = pd.DataFrame(labeled_img_list)
img = img.rename(columns = {0:"id"})
img['type'] = encoder.inverse_transform(pre_list)

img.to_csv('result.csv', index=False)
# len(labeled_img_list)
# len(pre_list)
# (Optional) The commented-out block below appears to sketch a pseudo-labeling step:
# merge the test-set predictions in result.csv back into the training csv and retrain.
# import pandas as pd

# df1 = pd.read_csv('train_sorted1.csv')
# df2 = pd.read_csv('result.csv')

# result = pd.concat([df1, df2], axis=0)

# result.to_csv('train_sorted2.csv', index=False)




Submitting the results
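Before uploading, a quick sanity check of result.csv can catch format problems (a sketch only; confirm the exact required format against the competition page):

import pandas as pd

result = pd.read_csv('result.csv')
print(result.head())
print(result['type'].value_counts())           # distribution of predicted types
assert list(result.columns) == ['id', 'type']  # columns produced above / described in the data section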
