Table of contents: Andrew Ng Deep Learning study notes directory
1. TensorFlow 2.0 tf.keras basics
2. Residual networks
1. TensorFlow 2.0 tf.keras basics
We build a model with the TensorFlow 2.0 tf.keras framework to recognize whether an image shows a smiling face (label 1 for smiling, 0 otherwise). There are three ways to build a model:
① with tf.keras.models.Sequential()
② with the functional API in tf.keras
③ by subclassing tf.keras.Model
(1) Imports and data description
import os
# os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pandas as pd
import sklearn
import sys
import time
import tensorflow as tf
from tensorflow import keras
import pprint
print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)
# Set up logical GPUs and cap GPU memory (a laptop GTX 1060 has a single card;
# if its memory fills up, everything stalls badly and may crash)
tf.debugging.set_log_device_placement(True)
gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_virtual_device_configuration(
    gpus[0],
    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024),
     tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
print(len(gpus))
logical_gpu = tf.config.experimental.list_logical_devices('GPU')
print(len(logical_gpu))
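As a lighter-weight alternative to fixed-size virtual devices, TensorFlow can also be told to claim GPU memory on demand. A minimal sketch (with no GPU present the loop simply does nothing):

```python
import tensorflow as tf

# Instead of pre-allocating fixed 1024 MB virtual devices, let TensorFlow
# grow its GPU memory allocation incrementally as it is needed.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
print(len(gpus))
```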
Data description:
import kt_utils
X_train,Y_train,X_test,Y_test,classes = kt_utils.load_dataset()
Y_train = Y_train.T
Y_test = Y_test.T
print("num of training samples:" ,X_train.shape[0])
print("num of test samples:" ,X_test.shape[0])
print("X_train shape:",X_train.shape)
print("Y_train shape:",Y_train.shape)
print("X_test shape:",X_test.shape)
print("Y_test shape:",Y_test.shape)
"""
Output:
num of training samples: 600
num of test samples: 150
X_train shape: (600, 64, 64, 3)
Y_train shape: (600, 1)
X_test shape: (150, 64, 64, 3)
Y_test shape: (150, 1)
"""
Plotting the data:
def show_imgs(n_rows, n_cols, x_data, y_data):
    assert len(x_data) == len(y_data)
    assert n_rows * n_cols < len(x_data)
    plt.figure(figsize=(n_cols * 1.4, n_rows * 1.6))
    for row in range(n_rows):
        for col in range(n_cols):
            index = n_cols * row + col
            plt.subplot(n_rows, n_cols, index + 1)
            plt.imshow(x_data[index], cmap='binary', interpolation='nearest')
            plt.axis('off')
            plt.title(y_data[index])
    plt.show()
show_imgs(3,5,X_train,Y_train)
Normalization:
X_train = X_train / 255
X_test = X_test / 255
(2) Building with tf.keras.models.Sequential()
Because of version changes, tf.keras.models.Sequential() has many aliases:
Class tf.compat.v1.keras.Sequential
Class tf.compat.v1.keras.models.Sequential
Class tf.compat.v2.keras.Sequential
Class tf.compat.v2.keras.models.Sequential
Class tf.keras.models.Sequential
tf.keras.models.Sequential() builds a neural network model from a list containing the layers to compute. tf.keras.layers provides many layer classes, such as convolution, pooling, dense (fully connected), and normalization, which are used to construct each layer:
model = keras.models.Sequential([
    # convolution stage: conv → BN → activation
    keras.layers.Conv2D(filters=32, kernel_size=7, padding="same", input_shape=(64, 64, 3)),
    keras.layers.BatchNormalization(),
    keras.layers.Activation("relu"),
    # pooling
    keras.layers.MaxPool2D(pool_size=2),
    # fully connected layers
    keras.layers.Flatten(),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.summary()  # prints a description of the model
Explanation:
Step 1: convolution; the first two arguments are the number of filters and the kernel size
Step 2: batch normalization
Step 3: ReLU activation
Step 4: pooling with a 2×2 window. Together these form the first stage of the network (given the small dataset and limited hardware, no further conv layers are added)
Step 5: flatten the data (for the fully connected layer)
Step 6: output layer (binary classification, so a single output unit with a sigmoid activation)
Model structure:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_9 (Conv2D) (None, 64, 64, 32) 4736
_________________________________________________________________
batch_normalization_9 (Batch (None, 64, 64, 32) 128
_________________________________________________________________
activation_9 (Activation) (None, 64, 64, 32) 0
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 32, 32, 32) 0
_________________________________________________________________
flatten_6 (Flatten) (None, 32768) 0
_________________________________________________________________
dense_12 (Dense) (None, 1) 32769
=================================================================
Total params: 37,633
Trainable params: 37,569
Non-trainable params: 64
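The parameter counts in the summary can be checked by hand: a Conv2D layer has kernel_h × kernel_w × in_channels × filters weights plus one bias per filter, BatchNormalization keeps four values per channel (only two of them trainable), and the Dense layer has one weight per flattened input plus a bias:

```python
# Conv2D: 7x7 kernels over 3 input channels, 32 filters, plus 32 biases
conv_params = 7 * 7 * 3 * 32 + 32
# BatchNormalization: gamma, beta, moving mean, moving variance per channel;
# gamma and beta are trainable, so 64 of the 128 are non-trainable
bn_params = 4 * 32
# Dense: the flattened 32x32x32 pooling output feeds 1 unit, plus 1 bias
dense_params = 32 * 32 * 32 * 1 + 1
print(conv_params, bn_params, dense_params)    # 4736 128 32769
print(conv_params + bn_params + dense_params)  # 37633
```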
Compile and train:
model.compile(loss="binary_crossentropy",
              optimizer="sgd",
              metrics=["acc"])
his = model.fit(X_train,Y_train,epochs=40,validation_data=(X_test,Y_test),batch_size=50)
Training results:
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 5)
    plt.show()
plot_learning_curves(his)
(3) Building with the tf.keras functional API
Besides the layer classes, tf.keras also provides functions such as Input(), which initializes a Keras tensor, and concatenate(), which joins layers. Here the layer classes are used as a functional API:
def my_model_1(start_shape):
    input_start = keras.Input(start_shape)
    conv = keras.layers.Conv2D(filters=32, kernel_size=7, padding="same")(input_start)
    bn = keras.layers.BatchNormalization(axis=3)(conv)
    acti = keras.layers.Activation("relu")(bn)
    pool_max_1 = keras.layers.MaxPool2D(pool_size=2)(acti)
    fla = keras.layers.Flatten()(pool_max_1)
    output_end = keras.layers.Dense(1, activation="sigmoid")(fla)
    model = keras.Model(inputs=input_start, outputs=output_end)
    return model
model_1 = my_model_1(X_train.shape[1:])
model_1.compile(loss="binary_crossentropy",
                optimizer="sgd",
                metrics=["acc"])
his1 = model_1.fit(X_train,Y_train,epochs=40,validation_data=(X_test,Y_test),batch_size=50)
Explanation:
Step 1: with this approach, Input() must be called first to initialize a Keras tensor
After that: the structure is the same as in (2), but here each step's output is passed as the argument to the next function.
(4) Building by subclassing tf.keras.Model
Here the model is a class that inherits from tf.keras.Model: the __init__ method defines the layers, and the call method defines the forward pass:
class My_model(tf.keras.Model):
    def __init__(self):
        super(My_model, self).__init__()
        self.hidden_conv = tf.keras.layers.Conv2D(filters=32, kernel_size=7, padding="same")
        self.hidden_bn = tf.keras.layers.BatchNormalization()
        self.hidden_ac = tf.keras.layers.Activation("relu")
        self.max_pool_1 = tf.keras.layers.MaxPool2D(pool_size=2)
        self.fla = tf.keras.layers.Flatten()
        self.out_put = tf.keras.layers.Dense(1, activation="sigmoid")

    def call(self, input_start):
        h_conv = self.hidden_conv(input_start)
        h_bn = self.hidden_bn(h_conv)
        h_ac = self.hidden_ac(h_bn)
        m_p_1 = self.max_pool_1(h_ac)
        f_la = self.fla(m_p_1)
        return self.out_put(f_la)
# model2 = keras.models.Sequential([My_model()])  # equivalent to the next line
model2 = My_model()
model2.build(input_shape = (None,64,64,3))
model2.compile(loss="binary_crossentropy",
               optimizer="sgd",
               metrics=["acc"])
his2 = model2.fit(X_train,Y_train,epochs=40,validation_data=(X_test,Y_test),batch_size=50)
Explanation: a subclassed model does not know the shape of its input in advance, so model2.build() constructs the model from the input shape it is given.
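A toy sketch (hypothetical TinyModel, not from the original) shows that the weights of a subclassed model are only created once an input shape is known, whether via build() or a first call:

```python
import numpy as np
import tensorflow as tf

class TinyModel(tf.keras.Model):
    def __init__(self):
        super(TinyModel, self).__init__()
        self.dense = tf.keras.layers.Dense(1, activation="sigmoid")

    def call(self, x):
        return self.dense(x)

m = TinyModel()
# No weights exist yet: the Dense layer has not seen an input shape.
m.build(input_shape=(None, 4))  # or: m(np.zeros((1, 4), dtype="float32"))
print(tuple(m.dense.kernel.shape))  # (4, 1)
```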
Note:
See Module: tf.keras for the tf.keras documentation.
(5) Testing the model
Grab an arbitrary face image from the web to test with; the downloaded image needs the following processing:
img = keras.preprocessing.image.load_img('./datasets/smile.jpg',target_size=(64,64))
plt.imshow(img)
x = keras.preprocessing.image.img_to_array(img)
x = np.expand_dims(x,axis=0)
print(x.shape)
"""
Output: (1, 64, 64, 3)
"""
print(model2.predict(x))
"""
Output: [[1]]
"""
2. Residual networks
2.1 Building a residual network by hand
Neural networks work well when trained on large datasets. In theory, the deeper the network, the better the trained model should perform, but in practice this is not the case: beyond a certain number of layers, performance starts to degrade. One reason is that the vanishing-gradient problem gets worse as the network deepens (better activation functions and batch normalization only mitigate it; they do not fully solve vanishing or exploding gradients).
A residual network feeds the original input directly into later layers, which amounts to having those layers learn the difference between the desired output and the input; in theory the block is then at least no worse than the identity. See 深度学习——残差神经网络ResNet在分别在Keras和tensorflow框架下的应用案例 for application examples in Keras and TensorFlow.
In the paper, a skip connection spans every two 3x3 convolutions, but a three-layer bottleneck is also described: a 1x1 convolution first reduces the channel dimension, a 3x3 convolution follows, and another 1x1 convolution restores it. ① Between layers with the same dimensions, an "identity" shortcut is used: X is simply added to the later layer's Z. ② Between layers with different dimensions, a convolution must first adjust the original input.
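The idea can be sketched numerically in plain NumPy (toy weights, not an actual ResNet layer): the block computes a residual F(x) and adds x back before the final activation, so when F vanishes the block reduces to the identity:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    # main path F(x); the shortcut adds x back before the final activation
    f = relu(x @ W1) @ W2
    return relu(f + x)

x = np.array([1.0, 2.0, 3.0])
W_zero = np.zeros((3, 3))
# With zero weights F(x) = 0, so the block passes x through unchanged
print(residual_block(x, W_zero, W_zero))  # [1. 2. 3.]
```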
(1) Identity shortcut
The identity shortcut is used when the height and width of the original input match those of the layer it connects to, as in the figure below:
def identity_block(X, f, kernels):
    f1, f2, f3 = kernels
    # save the original input
    X_short_cut = X
    # 1x1 convolution to reduce the channel dimension
    conv1 = keras.layers.Conv2D(filters=f1, kernel_size=(1, 1), strides=(1, 1),
                                padding="valid", kernel_initializer=keras.initializers.glorot_uniform(seed=0))(X)
    bn1 = keras.layers.BatchNormalization(axis=3)(conv1)
    relu1 = keras.layers.Activation("relu")(bn1)
    # 3x3 convolution, keeping height and width unchanged
    conv2 = keras.layers.Conv2D(filters=f2, kernel_size=(f, f), strides=(1, 1),
                                padding="same", kernel_initializer=keras.initializers.glorot_uniform(seed=0))(relu1)
    bn2 = keras.layers.BatchNormalization(axis=3)(conv2)
    relu2 = keras.layers.Activation("relu")(bn2)
    # 1x1 convolution to restore the channel dimension
    conv3 = keras.layers.Conv2D(filters=f3, kernel_size=(1, 1), strides=(1, 1),
                                padding="valid", kernel_initializer=keras.initializers.glorot_uniform(seed=0))(relu2)
    bn3 = keras.layers.BatchNormalization(axis=3)(conv3)
    Z = keras.layers.Add()([bn3, X_short_cut])
    relu3 = keras.layers.Activation("relu")(Z)
    return relu3
Note: in the third (channel-restoring) layer, the original input is added to that layer's batch-normalized output, and the sum is fed into the activation as Z.
(2) Convolutional shortcut
When the heights and widths of the input and output differ, the skip connection cannot be added to the output directly; a convolution must adjust its dimensions to match the output layer:
def conv_block(X, f, kernels, s=2):
    f1, f2, f3 = kernels
    X_short_cut = X
    conv1 = keras.layers.Conv2D(filters=f1, kernel_size=(1, 1), strides=(s, s),
                                padding="valid", kernel_initializer=keras.initializers.glorot_uniform(seed=0))(X)
    bn1 = keras.layers.BatchNormalization(axis=3)(conv1)
    relu1 = keras.layers.Activation("relu")(bn1)
    conv2 = keras.layers.Conv2D(filters=f2, kernel_size=(f, f), strides=(1, 1),
                                padding="same", kernel_initializer=keras.initializers.glorot_uniform(seed=0))(relu1)
    bn2 = keras.layers.BatchNormalization(axis=3)(conv2)
    relu2 = keras.layers.Activation("relu")(bn2)
    conv3 = keras.layers.Conv2D(filters=f3, kernel_size=(1, 1), strides=(1, 1),
                                padding="valid", kernel_initializer=keras.initializers.glorot_uniform(seed=0))(relu2)
    bn3 = keras.layers.BatchNormalization(axis=3)(conv3)
    # adjust the shortcut with a strided 1x1 convolution so its shape matches bn3
    X_short_cut_conv = keras.layers.Conv2D(filters=f3, kernel_size=(1, 1), strides=(s, s),
                                           padding="valid", kernel_initializer=keras.initializers.glorot_uniform(seed=0))(X_short_cut)
    X_short_cut_bn = keras.layers.BatchNormalization(axis=3)(X_short_cut_conv)
    Z = keras.layers.Add()([bn3, X_short_cut_bn])
    relu3 = keras.layers.Activation("relu")(Z)
    return relu3
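To see that the strided 1x1 shortcut really matches the main path's spatial shape, the standard convolution output-size formula can be applied (a quick sketch, using a 56x56 input as an example):

```python
import math

def conv_out(h, k, s, padding):
    # output size of a convolution along one spatial dimension (no dilation)
    if padding == "same":
        return math.ceil(h / s)
    return (h - k) // s + 1  # "valid"

# Main path of conv_block with s=2 on a 56x56 input:
h1 = conv_out(56, 1, 2, "valid")  # 1x1, stride 2 -> 28
h2 = conv_out(h1, 3, 1, "same")   # 3x3, stride 1 -> 28
h3 = conv_out(h2, 1, 1, "valid")  # 1x1, stride 1 -> 28
# Shortcut: a single 1x1, stride-2 convolution -> also 28
hs = conv_out(56, 1, 2, "valid")
print(h1, h2, h3, hs)  # 28 28 28 28
```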
(3) Building the model
The ResNet50 model structure is as follows:
def my_resnet(input_shape, classes):
    start_input = keras.Input(input_shape)
    start_input_zero_padding = keras.layers.ZeroPadding2D((3, 3))(start_input)
    stage_1_conv = keras.layers.Conv2D(64, kernel_size=(7, 7), strides=(2, 2),
                                       kernel_initializer=keras.initializers.glorot_uniform(seed=0))(start_input_zero_padding)
    stage_1_bn = keras.layers.BatchNormalization(axis=3)(stage_1_conv)
    stage_1_relu = keras.layers.Activation("relu")(stage_1_bn)
    stage_1_pooling = keras.layers.MaxPool2D(pool_size=(3, 3), strides=(2, 2))(stage_1_relu)
    stage_2_conv = conv_block(stage_1_pooling, f=3, kernels=[64, 64, 256], s=1)
    stage_2_idblock_1 = identity_block(stage_2_conv, f=3, kernels=[64, 64, 256])
    stage_2_idblock_2 = identity_block(stage_2_idblock_1, f=3, kernels=[64, 64, 256])
    stage_3_conv = conv_block(stage_2_idblock_2, f=3, kernels=[128, 128, 512], s=2)
    stage_3_idblock_1 = identity_block(stage_3_conv, f=3, kernels=[128, 128, 512])
    stage_3_idblock_2 = identity_block(stage_3_idblock_1, f=3, kernels=[128, 128, 512])
    stage_3_idblock_3 = identity_block(stage_3_idblock_2, f=3, kernels=[128, 128, 512])
    stage_4_conv = conv_block(stage_3_idblock_3, f=3, kernels=[256, 256, 1024], s=2)
    stage_4_idblock_1 = identity_block(stage_4_conv, f=3, kernels=[256, 256, 1024])
    stage_4_idblock_2 = identity_block(stage_4_idblock_1, f=3, kernels=[256, 256, 1024])
    stage_4_idblock_3 = identity_block(stage_4_idblock_2, f=3, kernels=[256, 256, 1024])
    stage_4_idblock_4 = identity_block(stage_4_idblock_3, f=3, kernels=[256, 256, 1024])
    stage_4_idblock_5 = identity_block(stage_4_idblock_4, f=3, kernels=[256, 256, 1024])
    stage_5_conv = conv_block(stage_4_idblock_5, f=3, kernels=[512, 512, 2048], s=2)
    # note: stage 5 of ResNet50 uses [512, 512, 2048] throughout
    stage_5_idblock_1 = identity_block(stage_5_conv, f=3, kernels=[512, 512, 2048])
    stage_5_idblock_2 = identity_block(stage_5_idblock_1, f=3, kernels=[512, 512, 2048])
    average_pooling = keras.layers.AveragePooling2D(pool_size=(2, 2))(stage_5_idblock_2)
    fla = keras.layers.Flatten()(average_pooling)
    output_end = keras.layers.Dense(classes, activation="softmax",
                                    kernel_initializer=keras.initializers.glorot_uniform(seed=0))(fla)
    model = keras.Model(inputs=start_input, outputs=output_end)
    return model
ResNet50's default input size is (224, 224, 3). Training here is done on a GPU, but a GTX 1060 does not have enough memory: training directly on (224, 224, 3) images with batch_size=24 raises an out-of-memory error:
ResourceExhaustedError: OOM when allocating tensor with shape[24,256,55,55]
It still fails with batch_size reduced to 12, so the images are resized to (64, 64, 3) with batch_size=12. The data used here is the 10 Monkey Species dataset; the downloaded images are stored in one folder per class, so keras.preprocessing.image.ImageDataGenerator() serves as the input pipeline. The official API docs are at tf.keras.preprocessing.image.ImageDataGenerator, and the individual parameters are explained (and tested one by one) in Keras ImageDataGenerator参数.
Data file paths:
train_dir = "./input/training/training"
valid_dir = "./input/validation/validation"
labels_file = "./input/monkey_labels.txt"
labels = pd.read_csv(labels_file)
print(labels)
"""
Output:
Label Latin Name Common Name \
0 n0 alouatta_palliata\t mantled_howler
1 n1 erythrocebus_patas\t patas_monkey
2 n2 cacajao_calvus\t bald_uakari
3 n3 macaca_fuscata\t japanese_macaque
4 n4 cebuella_pygmea\t pygmy_marmoset
5 n5 cebus_capucinus\t white_headed_capuchin
6 n6 mico_argentatus\t silvery_marmoset
7 n7 saimiri_sciureus\t common_squirrel_monkey
8 n8 aotus_nigriceps\t black_headed_night_monkey
9 n9 trachypithecus_johnii nilgiri_langur
Train Images Validation Images
0 131 26
1 139 28
2 137 27
3 152 30
4 131 26
5 141 28
6 132 26
7 142 28
8 133 27
9 132 26
"""
Process the data and create the generators:
height = 64
width = 64
channels = 3
batch_size = 12
num_classes = 10
# configure the ImageDataGenerator, i.e. how it should transform the data
train_data_generator = keras.preprocessing.image.ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=40,       # random rotation range
    width_shift_range=0.2,   # horizontal shift, as a fraction of width
    height_shift_range=0.2,  # vertical shift, as a fraction of height
    shear_range=0.2,         # random shear intensity
    zoom_range=0.2,          # random zoom range
    horizontal_flip=True,    # enable horizontal flips
    fill_mode="nearest")     # how to fill points left outside the boundaries after a transform
# read and transform the images with the configured ImageDataGenerator
train_generator = train_data_generator.flow_from_directory(
    train_dir,                    # root directory containing one subfolder per class
    target_size=(height, width),  # size of the generated images
    batch_size=batch_size,
    seed=42,
    shuffle=True,
    class_mode="categorical")
valid_data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)  # for validation; no augmentation needed
valid_generator = valid_data_generator.flow_from_directory(
    valid_dir,
    target_size=(height, width),
    batch_size=batch_size,
    seed=42,
    shuffle=True,
    class_mode="categorical")
num_train = train_generator.samples
num_valid = valid_generator.samples
"""
Output:
Found 1098 images belonging to 10 classes.
Found 272 images belonging to 10 classes.
"""
flow_from_directory returns a DirectoryIterator; unpacking a batch into a tuple (x, y) gives x, a numpy array holding one batch of image data, and y, a numpy array with the labels corresponding to x:
for i in range(1):
    x, y = train_generator.next()
    print(x.shape, y.shape)
    print(y)
"""
Output:
(12, 64, 64, 3) (12, 10)
[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]]
"""
Compile:
model_1 = my_resnet(input_shape=(64,64,3),classes=10)
model_1.compile(loss="categorical_crossentropy",
                optimizer="adam",
                metrics=['accuracy'])
model_1.summary()
Training:
logdir = os.path.join("my_resnet")
if not os.path.exists(logdir):
    os.mkdir(logdir)
callback = [
    keras.callbacks.TensorBoard(logdir),
]
his_1 = model_1.fit_generator(train_generator,
                              steps_per_epoch=num_train // batch_size,
                              epochs=300,
                              validation_data=valid_generator,
                              validation_steps=num_valid // batch_size,
                              callbacks=callback)  # without this, the TensorBoard callback defined above is never used
Hmm, the results are not great, and since training takes too long we stop here. Next, we train using the already-trained ResNet50 weights instead.
2.2 Transfer learning
That is, apply an already-trained ResNet and its weights to classifying this dataset, and fine-tune on top of it. There are generally two approaches: ① keep all convolutional-layer weights, replace the top fully connected layer, and train only that layer; ② also retrain the last few convolutional layers.
(1) Replacing only the fully connected layer
Generators: use ImageDataGenerator again, but preprocess the data with resnet50.preprocess_input
height = 224
width = 224
channels = 3
batch_size = 24
num_classes = 10
# configure the ImageDataGenerator, i.e. how it should transform the data
train_data_generator = keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=keras.applications.resnet50.preprocess_input,
    rotation_range=40,       # random rotation range
    width_shift_range=0.2,   # horizontal shift, as a fraction of width
    height_shift_range=0.2,  # vertical shift, as a fraction of height
    shear_range=0.2,         # random shear intensity
    zoom_range=0.2,          # random zoom range
    horizontal_flip=True,    # enable horizontal flips
    fill_mode="nearest")     # how to fill points left outside the boundaries after a transform
train_generator = train_data_generator.flow_from_directory(
    train_dir,                    # root directory containing one subfolder per class
    target_size=(height, width),  # size of the generated images
    batch_size=batch_size,
    seed=42,
    shuffle=True,
    class_mode="categorical")
valid_data_generator = keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=keras.applications.resnet50.preprocess_input)
valid_generator = valid_data_generator.flow_from_directory(
    valid_dir,
    target_size=(height, width),
    batch_size=batch_size,
    seed=42,
    shuffle=True,
    class_mode="categorical")
num_train = train_generator.samples
num_valid = valid_generator.samples
Modifying the model: ① add the ResNet as a single layer, with a parameter telling it not to keep the top fully connected layer; ② before compiling, mark this entire layer as non-trainable.
resnet50_fine_tune = keras.models.Sequential([
    keras.applications.ResNet50(include_top=False,   # whether to keep the top fully connected network
                                pooling='avg',        # pooling mode
                                weights='imagenet'),  # None would mean not using the pretrained ResNet50 weights
    keras.layers.Dense(num_classes, activation="softmax")  # add a fully connected output layer on top of the original conv layers
])
resnet50_fine_tune.layers[0].trainable = False  # freeze all weights of the original network; use the 'imagenet' weights directly
resnet50_fine_tune.compile(loss="categorical_crossentropy",
                           optimizer="sgd",
                           metrics=['accuracy'])
resnet50_fine_tune.summary()
his_2 = resnet50_fine_tune.fit_generator(train_generator,
                                         steps_per_epoch=num_train // batch_size,
                                         validation_data=valid_generator,
                                         validation_steps=num_valid // batch_size,
                                         epochs=10)
With the pretrained network, just 10 epochs already give good results:
(2) Retraining the last few convolutional layers of ResNet50
resnet50_conv = keras.applications.ResNet50(include_top = False,pooling = "avg",weights = "imagenet")
# freeze everything before the 5th-from-last layer of the ResNet50 conv stack,
# i.e. only the last 5 layers stay trainable
for layer in resnet50_conv.layers[0:-5]:
    layer.trainable = False
resnet50_fine_tune_2 = keras.models.Sequential([
    resnet50_conv,
    keras.layers.Dense(num_classes, activation="softmax"),
])
resnet50_fine_tune_2.compile(loss="categorical_crossentropy",
                             optimizer="sgd",
                             metrics=['accuracy'])
resnet50_fine_tune_2.summary()
Hmm, out of memory yet again. It does run on the CPU, but far too slowly, so we stop training here. Done.