
[AI Talent Training Camp] Reproducing the NAM Attention Mechanism Paper Based on ResNet50

Date: 2022-08-25 12:30:00

This project originated from the Baidu AI Talent Training Camp: a guided reading and code walkthrough of the paper, combining close reading with a PaddlePaddle reproduction of its code.

I. Paper Interpretation
Abstract

This paper proposes a Normalization-based Attention Module (NAM). A sparse weight penalty suppresses the weights of less salient features, making those weights more efficient in computation while maintaining comparable performance. Compared with other attention methods on ResNet and MobileNet, our method achieves higher accuracy.

1. Introduction

In recent years, attention mechanisms have become very popular; they help neural networks suppress less salient features in the channel or spatial dimensions. Many previous studies have focused on capturing salient features through attention operators, and these methods successfully exploit the mutual information across different feature dimensions. However, they neglect the contributing factors of the weights, which can further suppress insignificant features. We therefore aim to exploit these weight contribution factors to improve attention. We use the Batch Normalization scaling factors to represent the importance of weights, which avoids the extra fully connected and convolutional layers used by SE, BAM, and CBAM. In this way, we propose a Normalization-based Attention Module (NAM).

2. Method

Our proposed NAM adopts the module integration style of the efficient attention mechanism CBAM, but redesigns its channel and spatial attention submodules. NAM can be embedded at the end of each network block; for residual networks, it is embedded at the end of each residual structure. In the channel attention submodule we use the Batch Normalization scaling factor, as in Eq. (1) below, which reflects how strongly each channel varies and hence how important it is. Why? Intuitively, the scaling factor rescales the channel's variance under BN: the larger a channel's variation, the richer the information it carries and the greater its importance, whereas channels with little variation carry monotonous information and matter less.
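Since the original formula images did not survive extraction, the two equations referenced above are reconstructed here in LaTeX, following the notation of the NAM paper:

Eq. (1), Batch Normalization, where $\mu_{\mathcal{B}}$ and $\sigma_{\mathcal{B}}$ are the mini-batch mean and standard deviation and $\gamma$, $\beta$ are the trainable scale and shift:

$$B_{out} = \mathrm{BN}(B_{in}) = \gamma\,\frac{B_{in}-\mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2}+\epsilon}} + \beta \tag{1}$$

Eq. (2), the channel attention output, where each channel $i$ is weighted by its normalized scaling factor:

$$M_c = \mathrm{sigmoid}\big(W_{\gamma}\,\mathrm{BN}(F_1)\big), \qquad w_i = \frac{\gamma_i}{\sum_{j}\gamma_j} \tag{2}$$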

The channel attention submodule is therefore as shown in Figure 1 and Eq. (2), which gives the final output features; $\gamma$ is the scaling factor of each channel, from which each channel's weight is obtained. Applying the same normalization to every pixel in the spatial dimension yields the spatial attention weights, Eq. (3), which the paper calls pixel normalization. As shown in Figure 2, to suppress unimportant features a regularization term is added to the loss function, Eq. (4):
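The remaining two equations, again reconstructed in LaTeX in the NAM paper's notation ($\lambda$ are the BN scaling factors of the spatial submodule, $g(\cdot)$ is the $l_1$ norm, and $p$ balances the penalty):

$$M_s = \mathrm{sigmoid}\big(W_{\lambda}\,\mathrm{BN}_s(F_2)\big), \qquad w_i = \frac{\lambda_i}{\sum_{j}\lambda_j} \tag{3}$$

$$\mathrm{Loss} = \sum_{(x,y)} l\big(f(x, W), y\big) + p\sum g(\gamma) + p\sum g(\lambda) \tag{4}$$

To make the channel submodule concrete, below is a minimal Paddle sketch of NAM channel attention. It illustrates the idea, not the authors' release (the official reference implementation is in PyTorch); the class name and structure here are our own:

import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class NAMChannelAttention(nn.Layer):
    """Sketch: reuse BN scaling factors gamma as channel importance weights."""
    def __init__(self, channels):
        super().__init__()
        self.bn = nn.BatchNorm2D(channels)

    def forward(self, x):
        residual = x
        x = self.bn(x)
        # each channel's normalized |gamma| acts as its importance weight (Eq. 2)
        gamma = self.bn.weight.abs()
        w = gamma / gamma.sum()
        x = x * w.reshape([1, -1, 1, 1])
        return F.sigmoid(x) * residual

Embedding one of these at the end of each residual block is how the paper describes integrating NAM into ResNet.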

3. Experiments

We compare NAM with SE, BAM, CBAM, and TAM on ResNet and MobileNet, using the CIFAR-100 and ImageNet datasets, with the same preprocessing and training settings for every attention mechanism. The comparison shows that on CIFAR-100, NAM with channel attention alone or spatial attention alone already outperforms the other methods, and on ImageNet, NAM with both channel and spatial attention outperforms them as well.
4. Conclusion

We propose NAM, a module that improves efficiency by suppressing less salient features. Our experiments show that NAM yields efficiency gains on both ResNet and MobileNet. We are conducting a detailed analysis of NAM's integration variants and of how hyperparameter tuning affects performance. We also plan to combine NAM with different model compression techniques to further improve its efficiency. In the future, we will investigate its effect on other deep learning architectures and applications.

II. Dataset Introduction
The CIFAR-100 dataset has 100 classes. Each class contains 600 color images of size 32 × 32, of which 500 are used for training and 100 for testing. Each image carries two labels, fine_labels and coarse_labels, giving its fine-grained and coarse-grained category; these correspond to the classes and superclasses in the figure below. In other words, CIFAR-100 is hierarchically labeled.
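A quick way to sanity-check the dataset is to load it and inspect one sample. A minimal sketch; note that paddle.vision.datasets.Cifar100 exposes only the fine-grained label, so reading coarse_labels would require parsing the raw CIFAR-100 archive yourself:

import paddle

# Load the training split without transforms and inspect one sample
train_set = paddle.vision.datasets.Cifar100(mode='train')
image, fine_label = train_set[0]
print(type(image), fine_label)  # a 32x32 RGB image and an integer in [0, 100)
print(len(train_set))           # 50000 = 100 classes x 500 training images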
III. CIFAR-100 Experiments with ResNet50
3.1 Loading and Splitting the Dataset
In [8]
import paddle
import paddle.vision.transforms as t

def data_process():
    # Data augmentation strategy
    transform_strategy = t.Compose([
        t.ColorJitter(),            # randomly adjust brightness, contrast, etc.
        t.RandomHorizontalFlip(),   # random horizontal flip
        t.RandomVerticalFlip(),     # random vertical flip
        t.ToTensor()                # convert to tensor
    ])

    # Load the training set
    train_dataset = paddle.vision.datasets.Cifar100(
        mode='train',
        transform=transform_strategy
    )

    # The test set uses the same augmentation strategy as the training set,
    # to probe the model's generalization ability
    eval_dataset = paddle.vision.datasets.Cifar100(
        mode='test',
        transform=transform_strategy
    )

    print('Training samples: ' + str(len(train_dataset)),
          '| Test samples: ' + str(len(eval_dataset)))
    return train_dataset, eval_dataset

train_dataset, eval_dataset = data_process()  # get the datasets
Cache file /home/aistudio/.cache/paddle/dataset/cifar/cifar-100-python.tar.gz not found, downloading https://dataset.bj.bcebos.com/cifar/cifar-100-python.tar.gz
Begin to download
item 40921/41261 [============================>.] - ETA: 0s - 674us/item

Download finished
Training samples: 50000 | Test samples: 10000
3.2 Building ResNet50 with the Paddle API
In [9]
model = paddle.Model(paddle.vision.models.resnet50(pretrained=False))
# visualize the model structure
model.summary((-1, 3, 32, 32))
W0623 11:42:45.144419 167 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0623 11:42:45.149612 167 gpu_context.cc:306] device: 0, cuDNN Version: 7.6.

Layer (type) Input Shape Output Shape Param #

Conv2D-1 [[1, 3, 32, 32]] [1, 64, 16, 16] 9,408
BatchNorm2D-1 [[1, 64, 16, 16]] [1, 64, 16, 16] 256
ReLU-1 [[1, 64, 16, 16]] [1, 64, 16, 16] 0
MaxPool2D-1 [[1, 64, 16, 16]] [1, 64, 8, 8] 0
Conv2D-3 [[1, 64, 8, 8]] [1, 64, 8, 8] 4,096
BatchNorm2D-3 [[1, 64, 8, 8]] [1, 64, 8, 8] 256
ReLU-2 [[1, 256, 8, 8]] [1, 256, 8, 8] 0
Conv2D-4 [[1, 64, 8, 8]] [1, 64, 8, 8] 36,864
BatchNorm2D-4 [[1, 64, 8, 8]] [1, 64, 8, 8] 256
Conv2D-5 [[1, 64, 8, 8]] [1, 256, 8, 8] 16,384
BatchNorm2D-5 [[1, 256, 8, 8]] [1, 256, 8, 8] 1,024
Conv2D-2 [[1, 64, 8, 8]] [1, 256, 8, 8] 16,384
BatchNorm2D-2 [[1, 256, 8, 8]] [1, 256, 8, 8] 1,024
BottleneckBlock-1 [[1, 64, 8, 8]] [1, 256, 8, 8] 0
Conv2D-6 [[1, 256, 8, 8]] [1, 64, 8, 8] 16,384
BatchNorm2D-6 [[1, 64, 8, 8]] [1, 64, 8, 8] 256
ReLU-3 [[1, 256, 8, 8]] [1, 256, 8, 8] 0
Conv2D-7 [[1, 64, 8, 8]] [1, 64, 8, 8] 36,864
BatchNorm2D-7 [[1, 64, 8, 8]] [1, 64, 8, 8] 256
Conv2D-8 [[1, 64, 8, 8]] [1, 256, 8, 8] 16,384
BatchNorm2D-8 [[1, 256, 8, 8]] [1, 256, 8, 8] 1,024
BottleneckBlock-2 [[1, 256, 8, 8]] [1, 256, 8, 8] 0
Conv2D-9 [[1, 256, 8, 8]] [1, 64, 8, 8] 16,384
BatchNorm2D-9 [[1, 64, 8, 8]] [1, 64, 8, 8] 256
ReLU-4 [[1, 256, 8, 8]] [1, 256, 8, 8] 0
Conv2D-10 [[1, 64, 8, 8]] [1, 64, 8, 8] 36,864
BatchNorm2D-10 [[1, 64, 8, 8]] [1, 64, 8, 8] 256
Conv2D-11 [[1, 64, 8, 8]] [1, 256, 8, 8] 16,384
BatchNorm2D-11 [[1, 256, 8, 8]] [1, 256, 8, 8] 1,024
BottleneckBlock-3 [[1, 256, 8, 8]] [1, 256, 8, 8] 0
Conv2D-13 [[1, 256, 8, 8]] [1, 128, 8, 8] 32,768
BatchNorm2D-13 [[1, 128, 8, 8]] [1, 128, 8, 8] 512
ReLU-5 [[1, 512, 4, 4]] [1, 512, 4, 4] 0
Conv2D-14 [[1, 128, 8, 8]] [1, 128, 4, 4] 147,456
BatchNorm2D-14 [[1, 128, 4, 4]] [1, 128, 4, 4] 512
Conv2D-15 [[1, 128, 4, 4]] [1, 512, 4, 4] 65,536
BatchNorm2D-15 [[1, 512, 4, 4]] [1, 512, 4, 4] 2,048
Conv2D-12 [[1, 256, 8, 8]] [1, 512, 4, 4] 131,072
BatchNorm2D-12 [[1, 512, 4, 4]] [1, 512, 4, 4] 2,048
BottleneckBlock-4 [[1, 256, 8, 8]] [1, 512, 4, 4] 0
Conv2D-16 [[1, 512, 4, 4]] [1, 128, 4, 4] 65,536
BatchNorm2D-16 [[1, 128, 4, 4]] [1, 128, 4, 4] 512
ReLU-6 [[1, 512, 4, 4]] [1, 512, 4, 4] 0
Conv2D-17 [[1, 128, 4, 4]] [1, 128, 4, 4] 147,456
BatchNorm2D-17 [[1, 128, 4, 4]] [1, 128, 4, 4] 512
Conv2D-18 [[1, 128, 4, 4]] [1, 512, 4, 4] 65,536
BatchNorm2D-18 [[1, 512, 4, 4]] [1, 512, 4, 4] 2,048
BottleneckBlock-5 [[1, 512, 4, 4]] [1, 512, 4, 4] 0
Conv2D-19 [[1, 512, 4, 4]] [1, 128, 4, 4] 65,536
BatchNorm2D-19 [[1, 128, 4, 4]] [1, 128, 4, 4] 512
ReLU-7 [[1, 512, 4, 4]] [1, 512, 4, 4] 0
Conv2D-20 [[1, 128, 4, 4]] [1, 128, 4, 4] 147,456
BatchNorm2D-20 [[1, 128, 4, 4]] [1, 128, 4, 4] 512
Conv2D-21 [[1, 128, 4, 4]] [1, 512, 4, 4] 65,536
BatchNorm2D-21 [[1, 512, 4, 4]] [1, 512, 4, 4] 2,048
BottleneckBlock-6 [[1, 512, 4, 4]] [1, 512, 4, 4] 0
Conv2D-22 [[1, 512, 4, 4]] [1, 128, 4, 4] 65,536
BatchNorm2D-22 [[1, 128, 4, 4]] [1, 128, 4, 4] 512
ReLU-8 [[1, 512, 4, 4]] [1, 512, 4, 4] 0
Conv2D-23 [[1, 128, 4, 4]] [1, 128, 4, 4] 147,456
BatchNorm2D-23 [[1, 128, 4, 4]] [1, 128, 4, 4] 512
Conv2D-24 [[1, 128, 4, 4]] [1, 512, 4, 4] 65,536
BatchNorm2D-24 [[1, 512, 4, 4]] [1, 512, 4, 4] 2,048
BottleneckBlock-7 [[1, 512, 4, 4]] [1, 512, 4, 4] 0
Conv2D-26 [[1, 512, 4, 4]] [1, 256, 4, 4] 131,072
BatchNorm2D-26 [[1, 256, 4, 4]] [1, 256, 4, 4] 1,024
ReLU-9 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-27 [[1, 256, 4, 4]] [1, 256, 2, 2] 589,824
BatchNorm2D-27 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
Conv2D-28 [[1, 256, 2, 2]] [1, 1024, 2, 2] 262,144
BatchNorm2D-28 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 4,096
Conv2D-25 [[1, 512, 4, 4]] [1, 1024, 2, 2] 524,288
BatchNorm2D-25 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 4,096
BottleneckBlock-8 [[1, 512, 4, 4]] [1, 1024, 2, 2] 0
Conv2D-29 [[1, 1024, 2, 2]] [1, 256, 2, 2] 262,144
BatchNorm2D-29 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
ReLU-10 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-30 [[1, 256, 2, 2]] [1, 256, 2, 2] 589,824
BatchNorm2D-30 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
Conv2D-31 [[1, 256, 2, 2]] [1, 1024, 2, 2] 262,144
BatchNorm2D-31 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 4,096
BottleneckBlock-9 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-32 [[1, 1024, 2, 2]] [1, 256, 2, 2] 262,144
BatchNorm2D-32 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
ReLU-11 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-33 [[1, 256, 2, 2]] [1, 256, 2, 2] 589,824
BatchNorm2D-33 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
Conv2D-34 [[1, 256, 2, 2]] [1, 1024, 2, 2] 262,144
BatchNorm2D-34 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 4,096
BottleneckBlock-10 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-35 [[1, 1024, 2, 2]] [1, 256, 2, 2] 262,144
BatchNorm2D-35 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
ReLU-12 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-36 [[1, 256, 2, 2]] [1, 256, 2, 2] 589,824
BatchNorm2D-36 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
Conv2D-37 [[1, 256, 2, 2]] [1, 1024, 2, 2] 262,144
BatchNorm2D-37 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 4,096
BottleneckBlock-11 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-38 [[1, 1024, 2, 2]] [1, 256, 2, 2] 262,144
BatchNorm2D-38 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
ReLU-13 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-39 [[1, 256, 2, 2]] [1, 256, 2, 2] 589,824
BatchNorm2D-39 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
Conv2D-40 [[1, 256, 2, 2]] [1, 1024, 2, 2] 262,144
BatchNorm2D-40 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 4,096
BottleneckBlock-12 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-41 [[1, 1024, 2, 2]] [1, 256, 2, 2] 262,144
BatchNorm2D-41 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
ReLU-14 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-42 [[1, 256, 2, 2]] [1, 256, 2, 2] 589,824
BatchNorm2D-42 [[1, 256, 2, 2]] [1, 256, 2, 2] 1,024
Conv2D-43 [[1, 256, 2, 2]] [1, 1024, 2, 2] 262,144
BatchNorm2D-43 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 4,096
BottleneckBlock-13 [[1, 1024, 2, 2]] [1, 1024, 2, 2] 0
Conv2D-45 [[1, 1024, 2, 2]] [1, 512, 2, 2] 524,288
BatchNorm2D-45 [[1, 512, 2, 2]] [1, 512, 2, 2] 2,048
ReLU-15 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 0
Conv2D-46 [[1, 512, 2, 2]] [1, 512, 1, 1] 2,359,296
BatchNorm2D-46 [[1, 512, 1, 1]] [1, 512, 1, 1] 2,048
Conv2D-47 [[1, 512, 1, 1]] [1, 2048, 1, 1] 1,048,576
BatchNorm2D-47 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 8,192
Conv2D-44 [[1, 1024, 2, 2]] [1, 2048, 1, 1] 2,097,152
BatchNorm2D-44 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 8,192
BottleneckBlock-14 [[1, 1024, 2, 2]] [1, 2048, 1, 1] 0
Conv2D-48 [[1, 2048, 1, 1]] [1, 512, 1, 1] 1,048,576
BatchNorm2D-48 [[1, 512, 1, 1]] [1, 512, 1, 1] 2,048
ReLU-16 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 0
Conv2D-49 [[1, 512, 1, 1]] [1, 512, 1, 1] 2,359,296
BatchNorm2D-49 [[1, 512, 1, 1]] [1, 512, 1, 1] 2,048
Conv2D-50 [[1, 512, 1, 1]] [1, 2048, 1, 1] 1,048,576
BatchNorm2D-50 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 8,192
BottleneckBlock-15 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 0
Conv2D-51 [[1, 2048, 1, 1]] [1, 512, 1, 1] 1,048,576
BatchNorm2D-51 [[1, 512, 1, 1]] [1, 512, 1, 1] 2,048
ReLU-17 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 0
Conv2D-52 [[1, 512, 1, 1]] [1, 512, 1, 1] 2,359,296
BatchNorm2D-52 [[1, 512, 1, 1]] [1, 512, 1, 1] 2,048
Conv2D-53 [[1, 512, 1, 1]] [1, 2048, 1, 1] 1,048,576
BatchNorm2D-53 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 8,192
BottleneckBlock-16 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 0
AdaptiveAvgPool2D-1 [[1, 2048, 1, 1]] [1, 2048, 1, 1] 0
Linear-1 [[1, 2048]] [1, 1000] 2,049,000

Total params: 25,610,152
Trainable params: 25,503,912
Non-trainable params: 106,240

Input size (MB): 0.01
Forward/backward pass size (MB): 5.36
Params size (MB): 97.69
Estimated Total Size (MB): 103.07

{'total_params': 25610152, 'trainable_params': 25503912}
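One thing to note in the summary: Linear-1 outputs 1000 logits, the ImageNet default of paddle.vision.models.resnet50, while CIFAR-100 has only 100 classes. Training still runs because labels 0-99 index valid logits, but sizing the head to the task is cleaner. A minimal sketch using the constructor's num_classes argument:

import paddle

# Build ResNet50 with a 100-way head to match CIFAR-100
backbone = paddle.vision.models.resnet50(pretrained=False, num_classes=100)
model = paddle.Model(backbone)
model.summary((-1, 3, 32, 32))  # Linear-1 now maps [1, 2048] -> [1, 100]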
3.3 Model Training
In [12]
from paddle.optimizer.lr import CosineAnnealingDecay, MultiStepDecay, LinearWarmup

# A first prepare() call with plain SGD (learning rate 0.001) and cross-entropy loss;
# it is immediately overridden by the second prepare() below and kept only for reference
model.prepare(paddle.optimizer.SGD(learning_rate=0.001, parameters=model.parameters()),
              paddle.nn.CrossEntropyLoss(),
              paddle.metric.Accuracy())

# The configuration actually used: Momentum with weight decay, a 2000-step linear
# warmup from 0 to 0.001 followed by cosine annealing; accuracy reported as top-1 and top-5
model.prepare(
    paddle.optimizer.Momentum(
        learning_rate=LinearWarmup(CosineAnnealingDecay(0.001, 100), 2000, 0., 0.001),
        momentum=0.9,
        parameters=model.parameters(),
        weight_decay=5e-4),
    paddle.nn.CrossEntropyLoss(),
    paddle.metric.Accuracy(topk=(1, 5)))

callback_visualdl = paddle.callbacks.VisualDL(log_dir='visualdl_log_dir')

# Start training
model.fit(train_dataset,
          eval_dataset,
          epochs=100,        # number of training epochs
          batch_size=128,    # samples per batch
          verbose=1,         # logging mode
          shuffle=True,      # shuffle the dataset each epoch
          num_workers=4,
          callbacks=callback_visualdl,
          )
The loss value printed in the log is the current step, and the metric is the average value of previous steps.
Epoch 1/100
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:654: UserWarning: When training, we now always track global mean and variance.
  "When training, we now always track global mean and variance.")
step 391/391 [==============================] - loss: 5.8216 - acc_top1: 0.0083 - acc_top5: 0.0365 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 5.1818 - acc_top1: 0.0139 - acc_top5: 0.0630 - 30ms/step
Eval samples: 10000
Epoch 2/100
step 391/391 [==============================] - loss: 4.9121 - acc_top1: 0.0193 - acc_top5: 0.0816 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 4.3763 - acc_top1: 0.0310 - acc_top5: 0.1156 - 29ms/step
Eval samples: 10000
Epoch 3/100
step 391/391 [==============================] - loss: 4.5889 - acc_top1: 0.0426 - acc_top5: 0.1608 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 4.3340 - acc_top1: 0.0639 - acc_top5: 0.2092 - 29ms/step
Eval samples: 10000
Epoch 4/100
step 391/391 [==============================] - loss: 4.0951 - acc_top1: 0.0792 - acc_top5: 0.2558 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 4.4047 - acc_top1: 0.0998 - acc_top5: 0.2908 - 29ms/step
Eval samples: 10000
Epoch 5/100
step 391/391 [==============================] - loss: 4.0470 - acc_top1: 0.1093 - acc_top5: 0.3159 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 4.1264 - acc_top1: 0.1306 - acc_top5: 0.3494 - 29ms/step
Eval samples: 10000
Epoch 6/100
step 391/391 [==============================] - loss: 3.3129 - acc_top1: 0.1412 - acc_top5: 0.3718 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.7290 - acc_top1: 0.1552 - acc_top5: 0.3856 - 30ms/step
Eval samples: 10000
Epoch 7/100
step 391/391 [==============================] - loss: 3.6535 - acc_top1: 0.1597 - acc_top5: 0.4010 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.6085 - acc_top1: 0.1640 - acc_top5: 0.4079 - 29ms/step
Eval samples: 10000
Epoch 8/100
step 391/391 [==============================] - loss: 3.2932 - acc_top1: 0.1744 - acc_top5: 0.4265 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.4222 - acc_top1: 0.1731 - acc_top5: 0.4195 - 31ms/step
Eval samples: 10000
Epoch 9/100
step 391/391 [==============================] - loss: 3.4369 - acc_top1: 0.1843 - acc_top5: 0.4422 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.2243 - acc_top1: 0.1849 - acc_top5: 0.4386 - 29ms/step
Eval samples: 10000
Epoch 10/100
step 391/391 [==============================] - loss: 3.5167 - acc_top1: 0.1973 - acc_top5: 0.4603 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.5050 - acc_top1: 0.1892 - acc_top5: 0.4476 - 29ms/step
Eval samples: 10000
Epoch 11/100
step 391/391 [==============================] - loss: 3.1014 - acc_top1: 0.2053 - acc_top5: 0.4726 - 52ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.4008 - acc_top1: 0.1954 - acc_top5: 0.4587 - 29ms/step
Eval samples: 10000
Epoch 12/100
step 391/391 [==============================] - loss: 3.7199 - acc_top1: 0.2150 - acc_top5: 0.4872 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.4300 - acc_top1: 0.2004 - acc_top5: 0.4636 - 29ms/step
Eval samples: 10000
Epoch 13/100
step 391/391 [==============================] - loss: 3.4431 - acc_top1: 0.2238 - acc_top5: 0.4971 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.5822 - acc_top1: 0.2085 - acc_top5: 0.4721 - 38ms/step
Eval samples: 10000
Epoch 14/100
step 391/391 [==============================] - loss: 2.7427 - acc_top1: 0.2316 - acc_top5: 0.5125 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.4022 - acc_top1: 0.2151 - acc_top5: 0.4802 - 29ms/step
Eval samples: 10000
Epoch 15/100
step 391/391 [==============================] - loss: 3.3946 - acc_top1: 0.2384 - acc_top5: 0.5218 - 50ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.3966 - acc_top1: 0.2138 - acc_top5: 0.4899 - 29ms/step
Eval samples: 10000
Epoch 16/100
step 391/391 [==============================] - loss: 3.0503 - acc_top1: 0.2481 - acc_top5: 0.5355 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.5413 - acc_top1: 0.2195 - acc_top5: 0.4901 - 32ms/step
Eval samples: 10000
Epoch 17/100
step 391/391 [==============================] - loss: 3.1893 - acc_top1: 0.2571 - acc_top5: 0.5454 - 50ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.5470 - acc_top1: 0.2124 - acc_top5: 0.4812 - 29ms/step
Eval samples: 10000
Epoch 18/100
step 391/391 [==============================] - loss: 2.9919 - acc_top1: 0.2657 - acc_top5: 0.5556 - 50ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.6241 - acc_top1: 0.2206 - acc_top5: 0.4934 - 29ms/step
Eval samples: 10000
Epoch 19/100
step 391/391 [==============================] - loss: 2.9012 - acc_top1: 0.2698 - acc_top5: 0.5641 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.2880 - acc_top1: 0.2222 - acc_top5: 0.4976 - 29ms/step
Eval samples: 10000
Epoch 20/100
step 391/391 [==============================] - loss: 2.8051 - acc_top1: 0.2770 - acc_top5: 0.5716 - 50ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.3445 - acc_top1: 0.2281 - acc_top5: 0.5019 - 29ms/step
Eval samples: 10000
Epoch 21/100
step 391/391 [==============================] - loss: 2.8484 - acc_top1: 0.2824 - acc_top5: 0.5814 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.1611 - acc_top1: 0.2339 - acc_top5: 0.5085 - 29ms/step
Eval samples: 10000
Epoch 22/100
step 391/391 [==============================] - loss: 2.6754 - acc_top1: 0.2907 - acc_top5: 0.5896 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.0240 - acc_top1: 0.2327 - acc_top5: 0.5114 - 30ms/step
Eval samples: 10000
Epoch 23/100
step 391/391 [==============================] - loss: 3.2871 - acc_top1: 0.2964 - acc_top5: 0.5974 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.1273 - acc_top1: 0.2442 - acc_top5: 0.5140 - 30ms/step
Eval samples: 10000
Epoch 24/100
step 391/391 [==============================] - loss: 3.0351 - acc_top1: 0.3056 - acc_top5: 0.6029 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.3547 - acc_top1: 0.2479 - acc_top5: 0.5299 - 29ms/step
Eval samples: 10000
Epoch 25/100
step 391/391 [==============================] - loss: 2.7127 - acc_top1: 0.3093 - acc_top5: 0.6148 - 52ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.2127 - acc_top1: 0.2506 - acc_top5: 0.5291 - 30ms/step
Eval samples: 10000
Epoch 26/100
step 391/391 [==============================] - loss: 2.8931 - acc_top1: 0.3145 - acc_top5: 0.6212 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.9948 - acc_top1: 0.2512 - acc_top5: 0.5316 - 30ms/step
Eval samples: 10000
Epoch 27/100
step 391/391 [==============================] - loss: 3.0113 - acc_top1: 0.3210 - acc_top5: 0.6290 - 52ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.1367 - acc_top1: 0.2537 - acc_top5: 0.5331 - 37ms/step
Eval samples: 10000
Epoch 28/100
step 391/391 [==============================] - loss: 2.8563 - acc_top1: 0.3280 - acc_top5: 0.6368 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.9553 - acc_top1: 0.2667 - acc_top5: 0.5432 - 29ms/step
Eval samples: 10000
Epoch 29/100
step 391/391 [==============================] - loss: 2.3831 - acc_top1: 0.3327 - acc_top5: 0.6425 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.8827 - acc_top1: 0.2671 - acc_top5: 0.5509 - 30ms/step
Eval samples: 10000
Epoch 30/100
step 391/391 [==============================] - loss: 2.8241 - acc_top1: 0.3421 - acc_top5: 0.6498 - 52ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.1073 - acc_top1: 0.2667 - acc_top5: 0.5525 - 30ms/step
Eval samples: 10000
Epoch 31/100
step 391/391 [==============================] - loss: 2.5978 - acc_top1: 0.3466 - acc_top5: 0.6570 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.1769 - acc_top1: 0.2712 - acc_top5: 0.5555 - 30ms/step
Eval samples: 10000
Epoch 32/100
step 391/391 [==============================] - loss: 2.6507 - acc_top1: 0.3562 - acc_top5: 0.6653 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.0157 - acc_top1: 0.2752 - acc_top5: 0.5587 - 30ms/step
Eval samples: 10000
Epoch 33/100
step 391/391 [==============================] - loss: 3.0660 - acc_top1: 0.3567 - acc_top5: 0.6712 - 52ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.0034 - acc_top1: 0.2762 - acc_top5: 0.5559 - 30ms/step
Eval samples: 10000
Epoch 34/100
step 391/391 [==============================] - loss: 2.4485 - acc_top1: 0.3656 - acc_top5: 0.6777 - 52ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.9226 - acc_top1: 0.2772 - acc_top5: 0.5563 - 29ms/step
Eval samples: 10000
Epoch 35/100
step 391/391 [==============================] - loss: 2.7170 - acc_top1: 0.3770 - acc_top5: 0.6829 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.1584 - acc_top1: 0.2801 - acc_top5: 0.5603 - 29ms/step
Eval samples: 10000
Epoch 36/100
step 391/391 [==============================] - loss: 2.3328 - acc_top1: 0.3813 - acc_top5: 0.6928 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.8032 - acc_top1: 0.2761 - acc_top5: 0.5617 - 30ms/step
Eval samples: 10000
Epoch 37/100
step 391/391 [==============================] - loss: 2.3197 - acc_top1: 0.3866 - acc_top5: 0.6992 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.9066 - acc_top1: 0.2725 - acc_top5: 0.5597 - 30ms/step
Eval samples: 10000
Epoch 38/100
step 391/391 [==============================] - loss: 2.4766 - acc_top1: 0.3977 - acc_top5: 0.7053 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.7028 - acc_top1: 0.2792 - acc_top5: 0.5652 - 29ms/step
Eval samples: 10000
Epoch 39/100
step 391/391 [==============================] - loss: 2.4647 - acc_top1: 0.4030 - acc_top5: 0.7120 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.8679 - acc_top1: 0.2783 - acc_top5: 0.5623 - 30ms/step
Eval samples: 10000
Epoch 40/100
step 391/391 [==============================] - loss: 2.3726 - acc_top1: 0.4081 - acc_top5: 0.7183 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.6993 - acc_top1: 0.2803 - acc_top5: 0.5579 - 30ms/step
Eval samples: 10000
Epoch 41/100
step 391/391 [==============================] - loss: 2.3839 - acc_top1: 0.4168 - acc_top5: 0.7266 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.0042 - acc_top1: 0.2769 - acc_top5: 0.5667 - 33ms/step
Eval samples: 10000
Epoch 42/100
step 391/391 [==============================] - loss: 2.2273 - acc_top1: 0.4213 - acc_top5: 0.7319 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.9708 - acc_top1: 0.2747 - acc_top5: 0.5610 - 29ms/step
Eval samples: 10000
Epoch 43/100
step 391/391 [==============================] - loss: 2.3523 - acc_top1: 0.4268 - acc_top5: 0.7353 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.5551 - acc_top1: 0.2726 - acc_top5: 0.5633 - 29ms/step
Eval samples: 10000
Epoch 44/100
step 391/391 [==============================] - loss: 2.4685 - acc_top1: 0.4367 - acc_top5: 0.7454 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.4999 - acc_top1: 0.2706 - acc_top5: 0.5608 - 29ms/step
Eval samples: 10000
Epoch 45/100
step 391/391 [==============================] - loss: 2.3961 - acc_top1: 0.4377 - acc_top5: 0.7449 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 3.3164 - acc_top1: 0.2735 - acc_top5: 0.5627 - 32ms/step
Eval samples: 10000
Epoch 46/100
step 391/391 [==============================] - loss: 2.4653 - acc_top1: 0.4486 - acc_top5: 0.7549 - 51ms/step
Eval begin...
step 79/79 [==============================] - loss: 2.9869 - acc_top1: 0.2840 - acc_top5: 0.5694 - 29ms/step
Eval samples: 10000
Epo
