MixConv: Depthwise Separable Convolution with Mixed Receptive Fields
Abstract
Depthwise convolution is becoming increasingly popular in modern efficient networks, but its kernel size is often overlooked. In this paper, we systematically study the impact of different kernel sizes, and observe that combining the benefits of multiple kernel sizes can lead to better accuracy and efficiency. Based on this observation, we propose a new mixed depthwise convolution (MixConv), which naturally mixes multiple kernel sizes in a single convolution. As a simple drop-in replacement for vanilla depthwise convolution, our MixConv improves the accuracy and efficiency of existing MobileNets on both ImageNet classification and COCO object detection. To demonstrate the effectiveness of MixConv, we integrate it into the AutoML search space and develop a new family of models, named MixNets, which outperform previous mobile models including MobileNetV2 [23] (ImageNet top-1 accuracy +4.2%), ShuffleNetV2 [18] (+3.5%), MnasNet [29] (+1.3%), ProxylessNAS [2] (+2.2%), and FBNet [30] (+2.0%). In particular, our MixNet-L achieves a new state-of-the-art 78.9% ImageNet top-1 accuracy under typical mobile settings (<600M FLOPS).
1. MixConv
Unlike a regular depthwise convolution, which applies the same kernel size to every channel, MixConv partitions the channels into groups and applies a different kernel size to each group: the input is split along the channel dimension, group i is convolved depthwise with its own kernel size (e.g. 3, 5, 7, 9), and the group outputs are concatenated back together.
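To make this concrete, here is a minimal sketch of the idea with illustrative values only (the layer actually used in the experiments is implemented in Section 2.4.1):

import paddle
import paddle.nn as nn

x = paddle.randn([1, 16, 32, 32])               # NCHW input
groups = paddle.split(x, [4, 4, 4, 4], axis=1)  # 4 channel groups
# one depthwise conv per group, kernel sizes 3/5/7/9, 'same' padding
convs = [nn.Conv2D(4, 4, k, padding=k // 2, groups=4) for k in (3, 5, 7, 9)]
y = paddle.concat([c(g) for c, g in zip(convs, groups)], axis=1)
print(y.shape)  # [1, 16, 32, 32]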
2. Code Reproduction
2.1 Install and import the required packages
!pip install paddlex
%matplotlib inline
import os
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
import paddle
from paddle import nn
import paddle.nn.functional as F
from paddle.vision.datasets import Cifar10
import paddle.vision.transforms as transforms
from paddle.io import DataLoader
import paddlex
2.2 Create the dataset
train_tfm = transforms.Compose([
    transforms.Resize((130, 130)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    paddlex.transforms.MixupImage(),
    transforms.RandomResizedCrop(128, scale=(0.6, 1.0)),
    transforms.RandomHorizontalFlip(0.5),
    transforms.RandomRotation(20),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
test_tfm = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
paddle.vision.set_image_backend('cv2')
# Use the Cifar10 dataset
train_dataset = Cifar10(data_file='data/data152754/cifar-10-python.tar.gz', mode='train', transform=train_tfm)
val_dataset = Cifar10(data_file='data/data152754/cifar-10-python.tar.gz', mode='test', transform=test_tfm)
print("train_dataset: %d" % len(train_dataset))
print("val_dataset: %d" % len(val_dataset))
train_dataset: 50000
val_dataset: 10000
batch_size=128
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, drop_last=True, num_workers=2)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, drop_last=False, num_workers=2)
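As a quick sanity check (not part of the original notebook), one batch can be pulled from the loader to confirm that the augmentation pipeline produces tensors of the expected shape:

x_batch, y_batch = next(iter(train_loader))
print(x_batch.shape, y_batch.shape)  # expected: [128, 3, 128, 128] and [128]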
2.3 Label smoothing
class LabelSmoothingCrossEntropy(nn.Layer):
    def __init__(self, smoothing=0.1):
        super().__init__()
        self.smoothing = smoothing

    def forward(self, pred, target):
        confidence = 1. - self.smoothing
        log_probs = F.log_softmax(pred, axis=-1)
        # Negative log-likelihood of the true class for each sample
        idx = paddle.stack([paddle.arange(log_probs.shape[0]), target], axis=1)
        nll_loss = paddle.gather_nd(-log_probs, index=idx)
        # Uniform-smoothing term: mean negative log-probability over all classes
        smooth_loss = paddle.mean(-log_probs, axis=-1)
        loss = confidence * nll_loss + self.smoothing * smooth_loss
        return loss.mean()
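A quick sanity check with random logits (shapes here are arbitrary): the module should return a scalar, and with smoothing=0.1 the value is a 0.9/0.1 blend of the per-sample NLL and the uniform term.

logits = paddle.randn([4, 10])
targets = paddle.to_tensor([1, 3, 5, 7])
print(LabelSmoothingCrossEntropy(smoothing=0.1)(logits, targets))
print(F.cross_entropy(logits, targets))  # unsmoothed reference for comparison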
2.4 AlexNet-MixConv
2.4.1 MixConv
def _SplitChannels(channels, num_groups):
    # Split `channels` as evenly as possible into `num_groups` parts;
    # any remainder is assigned to the first group.
    split_channels = [channels // num_groups for _ in range(num_groups)]
    split_channels[0] += channels - sum(split_channels)
    return split_channels
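For example, an even channel count splits uniformly, while a remainder is absorbed by the first group:

print(_SplitChannels(16, 4))  # [4, 4, 4, 4]
print(_SplitChannels(18, 4))  # [6, 4, 4, 4]: the remainder of 2 goes to group 0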
class MixConv(nn.Layer):
    # num_groups: number of channel groups; group i uses a (2*i + 3) x (2*i + 3)
    # depthwise kernel (i.e. 3, 5, 7, 9, ...) with 'same' padding, so the
    # filter_size and padding arguments are kept only for API compatibility.
    def __init__(self, num_channels, num_filters, filter_size, stride, padding, num_groups=4, name=None):
        super().__init__()
        self.num_groups = num_groups
        self.split_in_channels = _SplitChannels(num_channels, num_groups)
        self.split_out_channels = _SplitChannels(num_filters, num_groups)
        self.mixconvs = nn.LayerList()
        for i in range(num_groups):
            kernel_size = 2 * i + 3
            self.mixconvs.append(nn.Conv2D(
                self.split_in_channels[i], self.split_out_channels[i],
                kernel_size, stride, kernel_size // 2,
                groups=self.split_in_channels[i], bias_attr=False))

    def forward(self, x):
        if self.num_groups == 1:
            return self.mixconvs[0](x)
        # Split channels, convolve each group with its own kernel size, concatenate
        x_split = paddle.split(x, self.split_in_channels, axis=1)
        x = [conv(t) for conv, t in zip(self.mixconvs, x_split)]
        x = paddle.concat(x, axis=1)
        return x
model = MixConv(16, 64, 3, 1, 2, 4)
paddle.summary(model, (1, 16, 224, 224))
---------------------------------------------------------------------------
Layer (type) Input Shape Output Shape Param #
===========================================================================
Conv2D-1 [[1, 4, 224, 224]] [1, 16, 224, 224] 144
Conv2D-2 [[1, 4, 224, 224]] [1, 16, 224, 224] 400
Conv2D-3 [[1, 4, 224, 224]] [1, 16, 224, 224] 784
Conv2D-4 [[1, 4, 224, 224]] [1, 16, 224, 224] 1,296
===========================================================================
Total params: 2,624
Trainable params: 2,624
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 3.06
Forward/backward pass size (MB): 24.50
Params size (MB): 0.01
Estimated Total Size (MB): 27.57
---------------------------------------------------------------------------
{'total_params': 2624, 'trainable_params': 2624}
class AlexNet_Mixconv(nn.Layer):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2D(3, 48, 11, 4, 11 // 2),
            nn.ReLU(),
            nn.MaxPool2D(kernel_size=3, stride=2),
            nn.Conv2D(48, 128, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2D(kernel_size=3, stride=2),
            # Two of AlexNet's 3x3 convolutions are replaced with MixConv blocks
            MixConv(128, 256, 3, 1, 1),
            nn.ReLU(),
            MixConv(256, 256, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2D(256, 128, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2D(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(3 * 3 * 128, 2048),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(2048, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = paddle.flatten(x, 1)
        x = self.classifier(x)
        return x
model = AlexNet_Mixconv(num_classes=10)
paddle.summary(model, (1, 3, 128, 128))
---------------------------------------------------------------------------
Layer (type) Input Shape Output Shape Param #
===========================================================================
Conv2D-5 [[1, 3, 128, 128]] [1, 48, 32, 32] 17,472
ReLU-5 [[1, 48, 32, 32]] [1, 48, 32, 32] 0
MaxPool2D-1 [[1, 48, 32, 32]] [1, 48, 15, 15] 0
Conv2D-6 [[1, 48, 15, 15]] [1, 128, 15, 15] 153,728
ReLU-6 [[1, 128, 15, 15]] [1, 128, 15, 15] 0
MaxPool2D-2 [[1, 128, 15, 15]] [1, 128, 7, 7] 0
Conv2D-7 [[1, 32, 7, 7]] [1, 64, 7, 7] 576
Conv2D-8 [[1, 32, 7, 7]] [1, 64, 7, 7] 1,600
Conv2D-9 [[1, 32, 7, 7]] [1, 64, 7, 7] 3,136
Conv2D-10 [[1, 32, 7, 7]] [1, 64, 7, 7] 5,184
MixConv-2 [[1, 128, 7, 7]] [1, 256, 7, 7] 0
ReLU-7 [[1, 256, 7, 7]] [1, 256, 7, 7] 0
Conv2D-11 [[1, 64, 7, 7]] [1, 64, 7, 7] 576
Conv2D-12 [[1, 64, 7, 7]] [1, 64, 7, 7] 1,600
Conv2D-13 [[1, 64, 7, 7]] [1, 64, 7, 7] 3,136
Conv2D-14 [[1, 64, 7, 7]] [1, 64, 7, 7] 5,184
MixConv-3 [[1, 256, 7, 7]] [1, 256, 7, 7] 0
ReLU-8 [[1, 256, 7, 7]] [1, 256, 7, 7] 0
Conv2D-15 [[1, 256, 7, 7]] [1, 128, 7, 7] 295,040
ReLU-9 [[1, 128, 7, 7]] [1, 128, 7, 7] 0
MaxPool2D-3 [[1, 128, 7, 7]] [1, 128, 3, 3] 0
Linear-1 [[1, 1152]] [1, 2048] 2,361,344
ReLU-10 [[1, 2048]] [1, 2048] 0
Dropout-1 [[1, 2048]] [1, 2048] 0
Linear-2 [[1, 2048]] [1, 2048] 4,196,352
ReLU-11 [[1, 2048]] [1, 2048] 0
Dropout-2 [[1, 2048]] [1, 2048] 0
Linear-3 [[1, 2048]] [1, 10] 20,490
===========================================================================
Total params: 7,065,418
Trainable params: 7,065,418
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 2.09
Params size (MB): 26.95
Estimated Total Size (MB): 29.23
---------------------------------------------------------------------------
{'total_params': 7065418, 'trainable_params': 7065418}
2.5 Training
learning_rate = 0.001
n_epochs = 50
paddle.seed(42)
np.random.seed(42)
work_path = 'work/model'
model = AlexNet_Mixconv(num_classes=10)
criterion = LabelSmoothingCrossEntropy()
scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=learning_rate, T_max=50000 // batch_size * n_epochs, verbose=False)
optimizer = paddle.optimizer.Adam(parameters=model.parameters(), learning_rate=scheduler, weight_decay=1e-5)

best_acc = 0.0
val_acc = 0.0
loss_record = {'train': {'loss': [], 'iter': []}, 'val': {'loss': [], 'iter': []}}  # for recording loss
acc_record = {'train': {'acc': [], 'iter': []}, 'val': {'acc': [], 'iter': []}}  # for recording accuracy
loss_iter = 0
acc_iter = 0

for epoch in range(n_epochs):
    # ---------- Training ----------
    model.train()
    train_num = 0.0
    train_loss = 0.0
    val_num = 0.0
    val_loss = 0.0
    accuracy_manager = paddle.metric.Accuracy()
    val_accuracy_manager = paddle.metric.Accuracy()
    print("#===epoch: {}, lr={:.10f}===#".format(epoch, optimizer.get_lr()))
    for batch_id, data in enumerate(train_loader):
        x_data, y_data = data
        labels = paddle.unsqueeze(y_data, axis=1)
        logits = model(x_data)
        loss = criterion(logits, y_data)
        acc = paddle.metric.accuracy(logits, labels)
        accuracy_manager.update(acc)
        if batch_id % 10 == 0:
            loss_record['train']['loss'].append(loss.numpy())
            loss_record['train']['iter'].append(loss_iter)
            loss_iter += 1
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.clear_grad()
        train_loss += loss
        train_num += len(y_data)
    total_train_loss = (train_loss / train_num) * batch_size
    train_acc = accuracy_manager.accumulate()
    acc_record['train']['acc'].append(train_acc)
    acc_record['train']['iter'].append(acc_iter)
    acc_iter += 1
    # Print the information.
    print("#===epoch: {}, train loss is: {}, train acc is: {:2.2f}%===#".format(epoch, total_train_loss.numpy(), train_acc * 100))

    # ---------- Validation ----------
    model.eval()
    for batch_id, data in enumerate(val_loader):
        x_data, y_data = data
        labels = paddle.unsqueeze(y_data, axis=1)
        with paddle.no_grad():
            logits = model(x_data)
            loss = criterion(logits, y_data)
            acc = paddle.metric.accuracy(logits, labels)
            val_accuracy_manager.update(acc)
            val_loss += loss
            val_num += len(y_data)
    total_val_loss = (val_loss / val_num) * batch_size
    loss_record['val']['loss'].append(total_val_loss.numpy())
    loss_record['val']['iter'].append(loss_iter)
    val_acc = val_accuracy_manager.accumulate()
    acc_record['val']['acc'].append(val_acc)
    acc_record['val']['iter'].append(acc_iter)
    print("#===epoch: {}, val loss is: {}, val acc is: {:2.2f}%===#".format(epoch, total_val_loss.numpy(), val_acc * 100))

    # ===================save====================
    if val_acc > best_acc:
        best_acc = val_acc
        paddle.save(model.state_dict(), os.path.join(work_path, 'best_model.pdparams'))
        paddle.save(optimizer.state_dict(), os.path.join(work_path, 'best_optimizer.pdopt'))

print(best_acc)
paddle.save(model.state_dict(), os.path.join(work_path, 'final_model.pdparams'))
paddle.save(optimizer.state_dict(), os.path.join(work_path, 'final_optimizer.pdopt'))
#===epoch: 0, lr=0.0010000000===#
#===epoch: 0, train loss is: [1.8332046], train acc is: 37.31%===#
#===epoch: 0, val loss is: [1.5604398], val acc is: 51.62%===#
#===epoch: 1, lr=0.0009990134===#
#===epoch: 1, train loss is: [1.6065372], train acc is: 49.95%===#
#===epoch: 1, val loss is: [1.4431052], val acc is: 58.04%===#
#===epoch: 2, lr=0.0009960574===#
#===epoch: 2, train loss is: [1.5085369], train acc is: 54.82%===#
#===epoch: 2, val loss is: [1.3632524], val acc is: 62.80%===#
#===epoch: 3, lr=0.0009911436===#
#===epoch: 3, train loss is: [1.4487118], train acc is: 58.17%===#
#===epoch: 3, val loss is: [1.300965], val acc is: 65.54%===#
#===epoch: 4, lr=0.0009842916===#
#===epoch: 4, train loss is: [1.4092876], train acc is: 60.00%===#
#===epoch: 4, val loss is: [1.2562582], val acc is: 66.83%===#
#===epoch: 5, lr=0.0009755283===#
#===epoch: 5, train loss is: [1.3632561], train acc is: 62.24%===#
#===epoch: 5, val loss is: [1.2178173], val acc is: 68.88%===#
#===epoch: 6, lr=0.0009648882===#
#===epoch: 6, train loss is: [1.336868], train acc is: 63.29%===#
#===epoch: 6, val loss is: [1.2229024], val acc is: 69.12%===#
#===epoch: 7, lr=0.0009524135===#
#===epoch: 7, train loss is: [1.3104041], train acc is: 64.57%===#
#===epoch: 7, val loss is: [1.1900145], val acc is: 70.71%===#
#===epoch: 8, lr=0.0009381533===#
#===epoch: 8, train loss is: [1.290957], train acc is: 65.48%===#
#===epoch: 8, val loss is: [1.1561929], val acc is: 71.58%===#
#===epoch: 9, lr=0.0009221640===#
#===epoch: 9, train loss is: [1.2713991], train acc is: 66.38%===#
#===epoch: 9, val loss is: [1.1582695], val acc is: 71.44%===#
#===epoch: 10, lr=0.0009045085===#
#===epoch: 10, train loss is: [1.256575], train acc is: 67.27%===#
#===epoch: 10, val loss is: [1.1347567], val acc is: 72.52%===#
#===epoch: 11, lr=0.0008852566===#
#===epoch: 11, train loss is: [1.2349375], train acc is: 68.05%===#
#===epoch: 11, val loss is: [1.1213648], val acc is: 73.43%===#
#===epoch: 12, lr=0.0008644843===#
#===epoch: 12, train loss is: [1.2255381], train acc is: 68.28%===#
#===epoch: 12, val loss is: [1.1092324], val acc is: 73.64%===#
#===epoch: 13, lr=0.0008422736===#
#===epoch: 13, train loss is: [1.211379], train acc is: 69.12%===#
#===epoch: 13, val loss is: [1.0956241], val acc is: 74.48%===#
#===epoch: 14, lr=0.0008187120===#
#===epoch: 14, train loss is: [1.2031662], train acc is: 69.55%===#
#===epoch: 14, val loss is: [1.0745709], val acc is: 75.38%===#
#===epoch: 15, lr=0.0007938926===#
#===epoch: 15, train loss is: [1.1895174], train acc is: 70.18%===#
#===epoch: 15, val loss is: [1.081457], val acc is: 75.05%===#
#===epoch: 16, lr=0.0007679134===#
#===epoch: 16, train loss is: [1.1810952], train acc is: 70.33%===#
#===epoch: 16, val loss is: [1.0502316], val acc is: 76.76%===#
#===epoch: 17, lr=0.0007408768===#
#===epoch: 17, train loss is: [1.1669109], train acc is: 71.11%===#
#===epoch: 17, val loss is: [1.05597], val acc is: 76.17%===#
#===epoch: 18, lr=0.0007128896===#
#===epoch: 18, train loss is: [1.1530827], train acc is: 71.83%===#
#===epoch: 18, val loss is: [1.047121], val acc is: 76.39%===#
#===epoch: 19, lr=0.0006840623===#
#===epoch: 19, train loss is: [1.145995], train acc is: 72.13%===#
#===epoch: 19, val loss is: [1.023506], val acc is: 77.50%===#
#===epoch: 20, lr=0.0006545085===#
#===epoch: 20, train loss is: [1.1302441], train acc is: 72.87%===#
#===epoch: 20, val loss is: [1.0353966], val acc is: 77.25%===#
#===epoch: 21, lr=0.0006243449===#
#===epoch: 21, train loss is: [1.121871], train acc is: 73.14%===#
#===epoch: 21, val loss is: [1.0212026], val acc is: 78.35%===#
#===epoch: 22, lr=0.0005936907===#
#===epoch: 22, train loss is: [1.1141613], train acc is: 73.35%===#
#===epoch: 22, val loss is: [1.0185486], val acc is: 78.14%===#
#===epoch: 23, lr=0.0005626666===#
#===epoch: 23, train loss is: [1.1039811], train acc is: 73.73%===#
#===epoch: 23, val loss is: [1.0128148], val acc is: 78.19%===#
#===epoch: 24, lr=0.0005313953===#
#===epoch: 24, train loss is: [1.0911169], train acc is: 74.48%===#
#===epoch: 24, val loss is: [1.0095358], val acc is: 78.32%===#
#===epoch: 25, lr=0.0005000000===#
#===epoch: 25, train loss is: [1.0807213], train acc is: 74.97%===#
#===epoch: 25, val loss is: [0.9975869], val acc is: 78.58%===#
#===epoch: 26, lr=0.0004686047===#
#===epoch: 26, train loss is: [1.0722029], train acc is: 75.41%===#
#===epoch: 26, val loss is: [1.0026505], val acc is: 78.49%===#
#===epoch: 27, lr=0.0004373334===#
#===epoch: 27, train loss is: [1.0646522], train acc is: 75.48%===#
#===epoch: 27, val loss is: [0.97930884], val acc is: 79.76%===#
#===epoch: 28, lr=0.0004063093===#
#===epoch: 28, train loss is: [1.0545902], train acc is: 75.98%===#
#===epoch: 28, val loss is: [0.97865117], val acc is: 79.64%===#
#===epoch: 29, lr=0.0003756551===#
#===epoch: 29, train loss is: [1.0444697], train acc is: 76.59%===#
#===epoch: 29, val loss is: [0.96476597], val acc is: 80.13%===#
#===epoch: 30, lr=0.0003454915===#
#===epoch: 30, train loss is: [1.037737], train acc is: 76.71%===#
#===epoch: 30, val loss is: [0.9573404], val acc is: 80.40%===#
#===epoch: 31, lr=0.0003159377===#
#===epoch: 31, train loss is: [1.0279362], train acc is: 77.33%===#
#===epoch: 31, val loss is: [0.9777868], val acc is: 79.72%===#
#===epoch: 32, lr=0.0002871104===#
#===epoch: 32, train loss is: [1.0181235], train acc is: 77.45%===#
#===epoch: 32, val loss is: [0.9529455], val acc is: 80.93%===#
#===epoch: 33, lr=0.0002591232===#
#===epoch: 33, train loss is: [1.0126927], train acc is: 78.02%===#
#===epoch: 33, val loss is: [0.9532719], val acc is: 80.87%===#
#===epoch: 34, lr=0.0002320866===#
#===epoch: 34, train loss is: [1.0000519], train acc is: 78.33%===#
#===epoch: 34, val loss is: [0.94096804], val acc is: 81.23%===#
#===epoch: 35, lr=0.0002061074===#
#===epoch: 35, train loss is: [0.9938114], train acc is: 78.71%===#
#===epoch: 35, val loss is: [0.9470541], val acc is: 81.20%===#
#===epoch: 36, lr=0.0001812880===#
#===epoch: 36, train loss is: [0.98845834], train acc is: 78.93%===#
#===epoch: 36, val loss is: [0.93678844], val acc is: 81.60%===#
#===epoch: 37, lr=0.0001577264===#
#===epoch: 37, train loss is: [0.98586285], train acc is: 78.99%===#
#===epoch: 37, val loss is: [0.93320215], val acc is: 81.62%===#
#===epoch: 38, lr=0.0001355157===#
#===epoch: 38, train loss is: [0.9760749], train acc is: 79.36%===#
#===epoch: 38, val loss is: [0.9337833], val acc is: 81.80%===#
#===epoch: 39, lr=0.0001147434===#
#===epoch: 39, train loss is: [0.9714146], train acc is: 79.79%===#
#===epoch: 39, val loss is: [0.9247616], val acc is: 82.04%===#
#===epoch: 40, lr=0.0000954915===#
#===epoch: 40, train loss is: [0.9661569], train acc is: 79.73%===#
#===epoch: 40, val loss is: [0.92751354], val acc is: 81.85%===#
#===epoch: 41, lr=0.0000778360===#
#===epoch: 41, train loss is: [0.9587111], train acc is: 80.09%===#
#===epoch: 41, val loss is: [0.92223495], val acc is: 82.17%===#
#===epoch: 42, lr=0.0000618467===#
#===epoch: 42, train loss is: [0.959283], train acc is: 80.08%===#
#===epoch: 42, val loss is: [0.92457324], val acc is: 82.09%===#
#===epoch: 43, lr=0.0000475865===#
#===epoch: 43, train loss is: [0.9539068], train acc is: 80.48%===#
#===epoch: 43, val loss is: [0.9255979], val acc is: 82.05%===#
#===epoch: 44, lr=0.0000351118===#
#===epoch: 44, train loss is: [0.9511745], train acc is: 80.55%===#
#===epoch: 44, val loss is: [0.91929686], val acc is: 82.39%===#
#===epoch: 45, lr=0.0000244717===#
#===epoch: 45, train loss is: [0.9490688], train acc is: 80.66%===#
#===epoch: 45, val loss is: [0.91968024], val acc is: 82.34%===#
#===epoch: 46, lr=0.0000157084===#
#===epoch: 46, train loss is: [0.9512466], train acc is: 80.48%===#
#===epoch: 46, val loss is: [0.9192811], val acc is: 82.42%===#
#===epoch: 47, lr=0.0000088564===#
#===epoch: 47, train loss is: [0.94866693], train acc is: 80.64%===#
#===epoch: 47, val loss is: [0.91951287], val acc is: 82.40%===#
#===epoch: 48, lr=0.0000039426===#
#===epoch: 48, train loss is: [0.948632], train acc is: 80.63%===#
#===epoch: 48, val loss is: [0.9191138], val acc is: 82.36%===#
#===epoch: 49, lr=0.0000009866===#
#===epoch: 49, train loss is: [0.94711], train acc is: 80.66%===#
#===epoch: 49, val loss is: [0.91893375], val acc is: 82.38%===#
0.8241693037974683
2.6 Results
def plot_learning_curve(record, title='loss', ylabel='CE Loss'):
    ''' Plot learning curve of your CNN '''
    maxtrain = max(map(float, record['train'][title]))
    maxval = max(map(float, record['val'][title]))
    ymax = max(maxtrain, maxval) * 1.1
    mintrain = min(map(float, record['train'][title]))
    minval = min(map(float, record['val'][title]))
    ymin = min(mintrain, minval) * 0.9
    x_1 = list(map(int, record['train']['iter']))
    x_2 = list(map(int, record['val']['iter']))
    figure(figsize=(10, 6))
    plt.plot(x_1, record['train'][title], c='tab:red', label='train')
    plt.plot(x_2, record['val'][title], c='tab:cyan', label='val')
    plt.ylim(ymin, ymax)
    plt.xlabel('Training steps')
    plt.ylabel(ylabel)
    plt.title('Learning curve of {}'.format(title))
    plt.legend()
    plt.show()

plot_learning_curve(loss_record, title='loss', ylabel='CE Loss')
plot_learning_curve(acc_record, title='acc', ylabel='Accuracy')
import time

work_path = 'work/model'
model = AlexNet_Mixconv(num_classes=10)
model_state_dict = paddle.load(os.path.join(work_path, 'best_model.pdparams'))
model.set_state_dict(model_state_dict)
model.eval()
aa = time.time()
for batch_id, data in enumerate(val_loader):
    x_data, y_data = data
    labels = paddle.unsqueeze(y_data, axis=1)
    with paddle.no_grad():
        logits = model(x_data)
bb = time.time()
print("Throughput: {}".format(int(len(val_dataset) // (bb - aa))))
Throughput: 1134
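Note that this number also includes data loading, and GPU kernels launch asynchronously, so the wall clock can under-count in-flight work. For a stricter measurement, one could synchronize before reading the clock; a minimal sketch, assuming a CUDA device and a Paddle version that provides paddle.device.cuda.synchronize:

paddle.device.cuda.synchronize()  # wait for pending GPU work before timing
start = time.time()
with paddle.no_grad():
    for x_data, _ in val_loader:
        model(x_data)
paddle.device.cuda.synchronize()  # make sure all batches actually finished
print("Throughput: {}".format(int(len(val_dataset) // (time.time() - start))))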
def get_cifar10_labels(labels):
    """Return the text labels of the CIFAR10 dataset."""
    text_labels = [
        'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog',
        'horse', 'ship', 'truck']
    return [text_labels[int(i)] for i in labels]

def show_images(imgs, num_rows, num_cols, pred=None, gt=None, scale=1.5):
    """Plot a list of images."""
    figsize = (num_cols * scale, num_rows * scale)
    _, axes = plt.subplots(num_rows, num_cols, figsize=figsize)
    axes = axes.flatten()
    for i, (ax, img) in enumerate(zip(axes, imgs)):
        if paddle.is_tensor(img):
            ax.imshow(img.numpy())
        else:
            ax.imshow(img)
        ax.axes.get_xaxis().set_visible(False)
        ax.axes.get_yaxis().set_visible(False)
        if pred or gt:
            ax.set_title("pt: " + pred[i] + "\ngt: " + gt[i])
    return axes
work_path = 'work/model'
X, y = next(iter(DataLoader(val_dataset, batch_size=18)))
model = AlexNet_Mixconv(num_classes=10)
model_state_dict = paddle.load(os.path.join(work_path, 'best_model.pdparams'))
model.set_state_dict(model_state_dict)
model.eval()
logits = model(X)
y_pred = paddle.argmax(logits, -1)
X = paddle.transpose(X, [0, 2, 3, 1])
axes = show_images(X.reshape((18, 128, 128, 3)), 1, 18, pred=get_cifar10_labels(y_pred), gt=get_cifar10_labels(y))
plt.show()
3. AlexNet
3.1 AlexNet
class AlexNet(nn.Layer):
    def __init__(self, num_classes=10):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2D(3, 48, kernel_size=11, stride=4, padding=11 // 2),
            nn.ReLU(),
            nn.MaxPool2D(kernel_size=3, stride=2),
            nn.Conv2D(48, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2D(kernel_size=3, stride=2),
            nn.Conv2D(128, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2D(256, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2D(256, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2D(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(3 * 3 * 128, 2048),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(2048, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = paddle.flatten(x, 1)
        x = self.classifier(x)
        return x
model = AlexNet(num_classes=10)
paddle.summary(model, (1, 3, 128, 128))
---------------------------------------------------------------------------
Layer (type) Input Shape Output Shape Param #
===========================================================================
Conv2D-49 [[1, 3, 128, 128]] [1, 48, 32, 32] 17,472
ReLU-33 [[1, 48, 32, 32]] [1, 48, 32, 32] 0
MaxPool2D-13 [[1, 48, 32, 32]] [1, 48, 15, 15] 0
Conv2D-50 [[1, 48, 15, 15]] [1, 128, 15, 15] 153,728
ReLU-34 [[1, 128, 15, 15]] [1, 128, 15, 15] 0
MaxPool2D-14 [[1, 128, 15, 15]] [1, 128, 7, 7] 0
Conv2D-51 [[1, 128, 7, 7]] [1, 256, 7, 7] 295,168
ReLU-35 [[1, 256, 7, 7]] [1, 256, 7, 7] 0
Conv2D-52 [[1, 256, 7, 7]] [1, 256, 7, 7] 590,080
ReLU-36 [[1, 256, 7, 7]] [1, 256, 7, 7] 0
Conv2D-53 [[1, 256, 7, 7]] [1, 128, 7, 7] 295,040
ReLU-37 [[1, 128, 7, 7]] [1, 128, 7, 7] 0
MaxPool2D-15 [[1, 128, 7, 7]] [1, 128, 3, 3] 0
Linear-13 [[1, 1152]] [1, 2048] 2,361,344
ReLU-38 [[1, 2048]] [1, 2048] 0
Dropout-9 [[1, 2048]] [1, 2048] 0
Linear-14 [[1, 2048]] [1, 2048] 4,196,352
ReLU-39 [[1, 2048]] [1, 2048] 0
Dropout-10 [[1, 2048]] [1, 2048] 0
Linear-15 [[1, 2048]] [1, 10] 20,490
===========================================================================
Total params: 7,929,674
Trainable params: 7,929,674
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 1.90
Params size (MB): 30.25
Estimated Total Size (MB): 32.34
---------------------------------------------------------------------------
{'total_params': 7929674, 'trainable_params': 7929674}
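As an optional extra check (not in the original notebook), parameter counts can be complemented with a FLOPs comparison at the 128x128 input resolution used here, assuming the installed Paddle version provides paddle.flops:

# Compare the computational cost of the baseline and the MixConv variant
paddle.flops(AlexNet(num_classes=10), [1, 3, 128, 128], print_detail=False)
paddle.flops(AlexNet_Mixconv(num_classes=10), [1, 3, 128, 128], print_detail=False)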
3.2 Training
learning_rate = 0.001
n_epochs = 50
paddle.seed(42)
np.random.seed(42)
work_path = 'work/model1'
model = AlexNet(num_classes=10)
criterion = LabelSmoothingCrossEntropy()
scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=learning_rate, T_max=50000 // batch_size * n_epochs, verbose=False)
optimizer = paddle.optimizer.Adam(parameters=model.parameters(), learning_rate=scheduler, weight_decay=1e-5)

best_acc = 0.0
val_acc = 0.0
loss_record1 = {'train': {'loss': [], 'iter': []}, 'val': {'loss': [], 'iter': []}}  # for recording loss
acc_record1 = {'train': {'acc': [], 'iter': []}, 'val': {'acc': [], 'iter': []}}  # for recording accuracy
loss_iter = 0
acc_iter = 0

for epoch in range(n_epochs):
    # ---------- Training ----------
    model.train()
    train_num = 0.0
    train_loss = 0.0
    val_num = 0.0
    val_loss = 0.0
    accuracy_manager = paddle.metric.Accuracy()
    val_accuracy_manager = paddle.metric.Accuracy()
    print("#===epoch: {}, lr={:.10f}===#".format(epoch, optimizer.get_lr()))
    for batch_id, data in enumerate(train_loader):
        x_data, y_data = data
        labels = paddle.unsqueeze(y_data, axis=1)
        logits = model(x_data)
        loss = criterion(logits, y_data)
        acc = paddle.metric.accuracy(logits, labels)
        accuracy_manager.update(acc)
        if batch_id % 10 == 0:
            loss_record1['train']['loss'].append(loss.numpy())
            loss_record1['train']['iter'].append(loss_iter)
            loss_iter += 1
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.clear_grad()
        train_loss += loss
        train_num += len(y_data)
    total_train_loss = (train_loss / train_num) * batch_size
    train_acc = accuracy_manager.accumulate()
    acc_record1['train']['acc'].append(train_acc)
    acc_record1['train']['iter'].append(acc_iter)
    acc_iter += 1
    # Print the information.
    print("#===epoch: {}, train loss is: {}, train acc is: {:2.2f}%===#".format(epoch, total_train_loss.numpy(), train_acc * 100))

    # ---------- Validation ----------
    model.eval()
    for batch_id, data in enumerate(val_loader):
        x_data, y_data = data
        labels = paddle.unsqueeze(y_data, axis=1)
        with paddle.no_grad():
            logits = model(x_data)
            loss = criterion(logits, y_data)
            acc = paddle.metric.accuracy(logits, labels)
            val_accuracy_manager.update(acc)
            val_loss += loss
            val_num += len(y_data)
    total_val_loss = (val_loss / val_num) * batch_size
    loss_record1['val']['loss'].append(total_val_loss.numpy())
    loss_record1['val']['iter'].append(loss_iter)
    val_acc = val_accuracy_manager.accumulate()
    acc_record1['val']['acc'].append(val_acc)
    acc_record1['val']['iter'].append(acc_iter)
    print("#===epoch: {}, val loss is: {}, val acc is: {:2.2f}%===#".format(epoch, total_val_loss.numpy(), val_acc * 100))

    # ===================save====================
    if val_acc > best_acc:
        best_acc = val_acc
        paddle.save(model.state_dict(), os.path.join(work_path, 'best_model.pdparams'))
        paddle.save(optimizer.state_dict(), os.path.join(work_path, 'best_optimizer.pdopt'))

print(best_acc)
paddle.save(model.state_dict(), os.path.join(work_path, 'final_model.pdparams'))
paddle.save(optimizer.state_dict(), os.path.join(work_path, 'final_optimizer.pdopt'))
#===epoch: 0, lr=0.0010000000===#
#===epoch: 0, train loss is: [1.9650538], train acc is: 31.60%===#
#===epoch: 0, val loss is: [1.6305186], val acc is: 48.43%===#
#===epoch: 1, lr=0.0009990134===#
#===epoch: 1, train loss is: [1.7221233], train acc is: 43.71%===#
#===epoch: 1, val loss is: [1.6270477], val acc is: 49.51%===#
#===epoch: 2, lr=0.0009960574===#
#===epoch: 2, train loss is: [1.6308397], train acc is: 48.63%===#
#===epoch: 2, val loss is: [1.4422747], val acc is: 57.20%===#
#===epoch: 3, lr=0.0009911436===#
#===epoch: 3, train loss is: [1.5810406], train acc is: 51.07%===#
#===epoch: 3, val loss is: [1.4248943], val acc is: 59.70%===#
#===epoch: 4, lr=0.0009842916===#
#===epoch: 4, train loss is: [1.5372194], train acc is: 53.51%===#
#===epoch: 4, val loss is: [1.4051503], val acc is: 59.93%===#
#===epoch: 5, lr=0.0009755283===#
#===epoch: 5, train loss is: [1.5010643], train acc is: 55.45%===#
#===epoch: 5, val loss is: [1.3973494], val acc is: 60.99%===#
#===epoch: 6, lr=0.0009648882===#
#===epoch: 6, train loss is: [1.4671948], train acc is: 57.16%===#
#===epoch: 6, val loss is: [1.3499789], val acc is: 62.58%===#
#===epoch: 7, lr=0.0009524135===#
#===epoch: 7, train loss is: [1.4449805], train acc is: 57.94%===#
#===epoch: 7, val loss is: [1.3680706], val acc is: 62.40%===#
#===epoch: 8, lr=0.0009381533===#
#===epoch: 8, train loss is: [1.4209102], train acc is: 59.30%===#
#===epoch: 8, val loss is: [1.2962121], val acc is: 65.51%===#
#===epoch: 9, lr=0.0009221640===#
#===epoch: 9, train loss is: [1.408174], train acc is: 60.22%===#
#===epoch: 9, val loss is: [1.31671], val acc is: 64.29%===#
#===epoch: 10, lr=0.0009045085===#
#===epoch: 10, train loss is: [1.3806214], train acc is: 61.43%===#
#===epoch: 10, val loss is: [1.2446662], val acc is: 67.59%===#
#===epoch: 11, lr=0.0008852566===#
#===epoch: 11, train loss is: [1.3622785], train acc is: 62.25%===#
#===epoch: 11, val loss is: [1.2531981], val acc is: 67.60%===#
#===epoch: 12, lr=0.0008644843===#
#===epoch: 12, train loss is: [1.3495497], train acc is: 63.01%===#
#===epoch: 12, val loss is: [1.2252071], val acc is: 68.14%===#
#===epoch: 13, lr=0.0008422736===#
#===epoch: 13, train loss is: [1.3302865], train acc is: 63.91%===#
#===epoch: 13, val loss is: [1.2246354], val acc is: 68.83%===#
#===epoch: 14, lr=0.0008187120===#
#===epoch: 14, train loss is: [1.325365], train acc is: 64.09%===#
#===epoch: 14, val loss is: [1.1886824], val acc is: 70.48%===#
#===epoch: 15, lr=0.0007938926===#
#===epoch: 15, train loss is: [1.3083125], train acc is: 64.63%===#
#===epoch: 15, val loss is: [1.2410982], val acc is: 67.76%===#
#===epoch: 16, lr=0.0007679134===#
#===epoch: 16, train loss is: [1.2942201], train acc is: 65.51%===#
#===epoch: 16, val loss is: [1.1892152], val acc is: 70.74%===#
#===epoch: 17, lr=0.0007408768===#
#===epoch: 17, train loss is: [1.284402], train acc is: 66.05%===#
#===epoch: 17, val loss is: [1.2043357], val acc is: 70.05%===#
#===epoch: 18, lr=0.0007128896===#
#===epoch: 18, train loss is: [1.2684674], train acc is: 66.85%===#
#===epoch: 18, val loss is: [1.1422471], val acc is: 72.52%===#
#===epoch: 19, lr=0.0006840623===#
#===epoch: 19, train loss is: [1.2642417], train acc is: 67.08%===#
#===epoch: 19, val loss is: [1.1457285], val acc is: 72.66%===#
#===epoch: 20, lr=0.0006545085===#
#===epoch: 20, train loss is: [1.2530406], train acc is: 67.44%===#
#===epoch: 20, val loss is: [1.1435425], val acc is: 72.41%===#
#===epoch: 21, lr=0.0006243449===#
#===epoch: 21, train loss is: [1.230555], train acc is: 68.47%===#
#===epoch: 21, val loss is: [1.151703], val acc is: 72.39%===#
#===epoch: 22, lr=0.0005936907===#
#===epoch: 22, train loss is: [1.2243475], train acc is: 68.78%===#
#===epoch: 22, val loss is: [1.1317416], val acc is: 73.12%===#
#===epoch: 23, lr=0.0005626666===#
#===epoch: 23, train loss is: [1.2125044], train acc is: 69.28%===#
#===epoch: 23, val loss is: [1.131524], val acc is: 73.56%===#
#===epoch: 24, lr=0.0005313953===#
#===epoch: 24, train loss is: [1.1983484], train acc is: 70.06%===#
#===epoch: 24, val loss is: [1.1417092], val acc is: 73.41%===#
#===epoch: 25, lr=0.0005000000===#
#===epoch: 25, train loss is: [1.1918993], train acc is: 70.11%===#
#===epoch: 25, val loss is: [1.1028641], val acc is: 74.72%===#
#===epoch: 26, lr=0.0004686047===#
#===epoch: 26, train loss is: [1.1755028], train acc is: 70.75%===#
#===epoch: 26, val loss is: [1.0835562], val acc is: 75.52%===#
#===epoch: 27, lr=0.0004373334===#
#===epoch: 27, train loss is: [1.1704789], train acc is: 71.13%===#
#===epoch: 27, val loss is: [1.0854902], val acc is: 76.30%===#
#===epoch: 28, lr=0.0004063093===#
#===epoch: 28, train loss is: [1.1576461], train acc is: 71.59%===#
#===epoch: 28, val loss is: [1.0876684], val acc is: 75.67%===#
#===epoch: 29, lr=0.0003756551===#
#===epoch: 29, train loss is: [1.1432424], train acc is: 72.16%===#
#===epoch: 29, val loss is: [1.0748475], val acc is: 76.01%===#
#===epoch: 30, lr=0.0003454915===#
#===epoch: 30, train loss is: [1.1328856], train acc is: 72.69%===#
#===epoch: 30, val loss is: [1.0685778], val acc is: 76.72%===#
#===epoch: 31, lr=0.0003159377===#
#===epoch: 31, train loss is: [1.1225183], train acc is: 73.08%===#
#===epoch: 31, val loss is: [1.0607836], val acc is: 76.54%===#
#===epoch: 32, lr=0.0002871104===#
#===epoch: 32, train loss is: [1.114567], train acc is: 73.67%===#
#===epoch: 32, val loss is: [1.0464559], val acc is: 77.76%===#
#===epoch: 33, lr=0.0002591232===#
#===epoch: 33, train loss is: [1.1031892], train acc is: 74.00%===#
#===epoch: 33, val loss is: [1.0455275], val acc is: 77.51%===#
#===epoch: 34, lr=0.0002320866===#
#===epoch: 34, train loss is: [1.0884582], train acc is: 74.94%===#
#===epoch: 34, val loss is: [1.0408577], val acc is: 77.93%===#
#===epoch: 35, lr=0.0002061074===#
#===epoch: 35, train loss is: [1.0837501], train acc is: 74.68%===#
#===epoch: 35, val loss is: [1.0423734], val acc is: 77.95%===#
#===epoch: 36, lr=0.0001812880===#
#===epoch: 36, train loss is: [1.0759592], train acc is: 74.98%===#
#===epoch: 36, val loss is: [1.0235242], val acc is: 78.32%===#
#===epoch: 37, lr=0.0001577264===#
#===epoch: 37, train loss is: [1.0702893], train acc is: 75.37%===#
#===epoch: 37, val loss is: [1.016121], val acc is: 79.02%===#
#===epoch: 38, lr=0.0001355157===#
#===epoch: 38, train loss is: [1.0639284], train acc is: 75.69%===#
#===epoch: 38, val loss is: [1.0210428], val acc is: 78.49%===#
#===epoch: 39, lr=0.0001147434===#
#===epoch: 39, train loss is: [1.0564318], train acc is: 76.00%===#
#===epoch: 39, val loss is: [1.0193514], val acc is: 79.28%===#
#===epoch: 40, lr=0.0000954915===#
#===epoch: 40, train loss is: [1.051025], train acc is: 76.37%===#
#===epoch: 40, val loss is: [1.014566], val acc is: 79.20%===#
#===epoch: 41, lr=0.0000778360===#
#===epoch: 41, train loss is: [1.0399163], train acc is: 76.71%===#
#===epoch: 41, val loss is: [1.014955], val acc is: 78.92%===#
#===epoch: 42, lr=0.0000618467===#
#===epoch: 42, train loss is: [1.0356172], train acc is: 77.01%===#
#===epoch: 42, val loss is: [1.0119495], val acc is: 79.00%===#
#===epoch: 43, lr=0.0000475865===#
#===epoch: 43, train loss is: [1.0322536], train acc is: 77.08%===#
#===epoch: 43, val loss is: [1.0156872], val acc is: 79.09%===#
#===epoch: 44, lr=0.0000351118===#
#===epoch: 44, train loss is: [1.0303307], train acc is: 77.04%===#
#===epoch: 44, val loss is: [1.0064815], val acc is: 79.43%===#
#===epoch: 45, lr=0.0000244717===#
#===epoch: 45, train loss is: [1.02571], train acc is: 77.35%===#
#===epoch: 45, val loss is: [1.0124086], val acc is: 79.30%===#
#===epoch: 46, lr=0.0000157084===#
#===epoch: 46, train loss is: [1.0294893], train acc is: 77.29%===#
#===epoch: 46, val loss is: [1.0075635], val acc is: 79.39%===#
#===epoch: 47, lr=0.0000088564===#
#===epoch: 47, train loss is: [1.0229493], train acc is: 77.52%===#
#===epoch: 47, val loss is: [1.0071208], val acc is: 79.50%===#
#===epoch: 48, lr=0.0000039426===#
#===epoch: 48, train loss is: [1.022951], train acc is: 77.40%===#
#===epoch: 48, val loss is: [1.0082942], val acc is: 79.53%===#
#===epoch: 49, lr=0.0000009866===#
#===epoch: 49, train loss is: [1.0234195], train acc is: 77.50%===#
#===epoch: 49, val loss is: [1.0080563], val acc is: 79.57%===#
0.7956882911392406
3.3 Results
plot_learning_curve(loss_record1, title='loss', ylabel='CE Loss')
plot_learning_curve(acc_record1, title='acc', ylabel='Accuracy')
import time

work_path = 'work/model1'
model = AlexNet(num_classes=10)
model_state_dict = paddle.load(os.path.join(work_path, 'best_model.pdparams'))
model.set_state_dict(model_state_dict)
model.eval()
aa = time.time()
for batch_id, data in enumerate(val_loader):
    x_data, y_data = data
    labels = paddle.unsqueeze(y_data, axis=1)
    with paddle.no_grad():
        logits = model(x_data)
bb = time.time()
print("Throughput: {}".format(int(len(val_dataset) // (bb - aa))))
Throughput: 1165
work_path = 'work/model1'
X, y = next(iter(DataLoader(val_dataset, batch_size=18)))
model = AlexNet(num_classes=10)
model_state_dict = paddle.load(os.path.join(work_path, 'best_model.pdparams'))
model.set_state_dict(model_state_dict)
model.eval()
logits = model(X)
y_pred = paddle.argmax(logits, -1)
X = paddle.transpose(X, [0, 2, 3, 1])
axes = show_images(X.reshape((18, 128, 128, 3)), 1, 18, pred=get_cifar10_labels(y_pred), gt=get_cifar10_labels(y))
plt.show()
4. Comparison of Results
| model | Train Acc | Val Acc | Parameters |
| --- | --- | --- | --- |
| AlexNet w/o MixConv | 0.7750 | 0.79569 | 7,929,674 |
| AlexNet w/ MixConv | 0.8048 | 0.82417 | 7,065,418 |
Summary
MixConv reduces the parameter count (by 864,256 parameters) while converging noticeably faster and reaching higher accuracy (+0.02848 validation accuracy).
This post is a mirror; original project: https://aistudio.baidu.com/aistudio/projectdetail/4349384