1. Loading the data
Loading the MNIST dataset:
In [1]: import tensorflow as tf
In [2]: (x,y),(x_test,y_test) = tf.keras.datasets.mnist.load_data()
In [3]: x.shape,y.shape
Out[3]: ((60000, 28, 28), (60000,))
In [4]: x.max(), x.min(), x.mean()
Out[4]: (255, 0, 33.318421449829934)
In [5]: y[:4]
Out[5]: array([5, 0, 4, 1], dtype=uint8)
In [6]: tf.one_hot(y[:4],depth=10)
Out[6]:
<tf.Tensor: shape=(4, 10), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>
Loading the CIFAR10 dataset:
In [7]: (x,y),(x_test,y_test) = tf.keras.datasets.cifar10.load_data()
In [8]: x.shape,y.shape
Out[8]: ((50000, 32, 32, 3), (50000, 1))
In [9]: y[:4]
Out[9]:
array([[6],
[9],
[9],
[4]], dtype=uint8)
In [10]: db = tf.data.Dataset.from_tensor_slices(x_test)
In [11]: next(iter(db)).shape
Out[11]: TensorShape([32, 32, 3])
In [12]: next(iter(db)).shape
Out[12]: TensorShape([32, 32, 3])
2. tf.data.Dataset.from_tensor_slices
In [13]: db = tf.data.Dataset.from_tensor_slices((x_test,y_test))
In [14]: next(iter(db))[0].shape,next(iter(db))[1].shape
Out[14]: (TensorShape([32, 32, 3]), TensorShape([1]))
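A Dataset built this way is directly iterable. Below is a minimal sketch of looping over a few samples (the names x_sample and y_sample are just illustrative):

# Datasets are iterable; take(3) restricts the loop to the first 3 samples.
for x_sample, y_sample in db.take(3):
    print(x_sample.shape, y_sample.numpy())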
shuffle: use shuffle to randomize the order of the original samples.
In [15]: db = db.shuffle(10000)
batch: for mini-batch training, use batch to group the data into batches of a fixed size.
In [16]: db2 = db.batch(128)
In [17]: res = next(iter(db2))
In [18]: res[0].shape, res[1].shape
Out[18]: (TensorShape([128, 32, 32, 3]), TensorShape([128, 1]))
map: use map to apply a preprocessing step to the data. You write the preprocessing logic yourself, for example casting to a different dtype, reshaping, or adding and removing dimensions, all of which come down to the basic Tensor operations; a minimal sketch is shown below.
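A minimal sketch of what such a preprocessing function could look like for the CIFAR10 pipeline above (the function name preprocess and the exact transformations are illustrative choices, not a fixed API):

import tensorflow as tf

def preprocess(x, y):
    # cast uint8 pixels in [0, 255] to float32 in [0, 1]
    x = tf.cast(x, tf.float32) / 255.
    # labels come as shape [1]; squeeze to a scalar and one-hot encode
    y = tf.squeeze(tf.cast(y, tf.int32), axis=-1)
    y = tf.one_hot(y, depth=10)
    return x, y

db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db = db.map(preprocess).shuffle(10000).batch(128)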
3. Fully connected layers
A simple fully connected layer with 512 units: net is a layer object, its kernel attribute is the weight matrix w, and its bias attribute is the bias vector b. Note that kernel and bias are only created after the layer has been fed an input (the input size is needed to build them); otherwise you have to build the layer explicitly, as in the sketch further below.
In [20]: x = tf.random.normal([10, 784])
In [21]: x.shape
Out[21]: TensorShape([10, 784])
In [22]: net = tf.keras.layers.Dense(512)
In [23]: out = net(x)
In [24]: out.shape
Out[24]: TensorShape([10, 512])
In [25]: net.kernel.shape, net.bias.shape
Out[25]: (TensorShape([784, 512]), TensorShape([512]))
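If you want kernel and bias to exist before any data is fed in, you can build the layer explicitly from an input shape; a minimal sketch:

# Build the layer from an explicit input shape instead of feeding data;
# the last dimension (784) is the input feature size.
net = tf.keras.layers.Dense(512)
net.build(input_shape=(None, 784))
print(net.kernel.shape, net.bias.shape)   # (784, 512) (512,)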
A multi-layer network structure (this assumes from tensorflow.keras import Sequential, layers has been run):
In [3]: x = tf.random.normal([3,4])
In [4]: model = Sequential([
...: layers.Dense(3, activation='relu'),
...: layers.Dense(3, activation='relu'),
...: layers.Dense(3)
...: ])
In [5]: model.build(input_shape=[None,3])
In [6]: model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                multiple                  12
_________________________________________________________________
dense_1 (Dense)              multiple                  12
_________________________________________________________________
dense_2 (Dense)              multiple                  12
=================================================================
Total params: 36
Trainable params: 36
Non-trainable params: 0
_________________________________________________________________
In [7]: for i in model.trainable_variables:
...: print(i.name, i.shape)
...:
dense/kernel:0 (3, 3)
dense/bias:0 (3,)
dense_1/kernel:0 (3, 3)
dense_1/bias:0 (3,)
dense_2/kernel:0 (3, 3)
dense_2/bias:0 (3,)
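These trainable_variables are what you pass to an optimizer in a custom training loop. A minimal sketch of one hand-written training step, assuming an Adam optimizer and random dummy data (both are illustrative choices):

# One training step on dummy data; in practice x and y come from a Dataset.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
x = tf.random.normal([3, 3])          # matches the input_shape=[None, 3] used above
y = tf.random.normal([3, 3])          # dummy regression targets

with tf.GradientTape() as tape:       # trainable variables are watched automatically
    out = model(x)
    loss = tf.reduce_mean(tf.losses.MSE(y, out))
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))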
Activation functions: relu, sigmoid, softmax, tanh
In [8]: a = tf.linspace(-5.,5.,10)
In [9]: a
Out[9]:
In [10]: tf.nn.relu(a)
Out[10]:
In [11]: tf.nn.sigmoid(a)
Out[11]:
In [12]: tf.nn.softmax(a)
Out[12]:
In [13]: tf.reduce_sum(tf.nn.softmax(a))
Out[13]:
In [14]: tf.nn.tanh(a)
Out[14]:
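For reference, the same activations written out from their textbook definitions (a small sketch; the variable names are illustrative):

a = tf.linspace(-5., 5., 10)

relu_manual    = tf.maximum(a, 0.)                        # relu(x) = max(x, 0)
sigmoid_manual = 1. / (1. + tf.exp(-a))                   # sigmoid(x) = 1 / (1 + e^(-x))
softmax_manual = tf.exp(a) / tf.reduce_sum(tf.exp(a))     # softmax normalizes exp(x) to sum to 1
tanh_manual    = (tf.exp(a) - tf.exp(-a)) / (tf.exp(a) + tf.exp(-a))

print(tf.reduce_sum(softmax_manual))                      # softmax outputs sum to 1 (cf. In [13] above)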
Loss computation: mean squared error and cross-entropy
In [15]: y = tf.constant([3,2,0,1,2])
In [16]: y = tf.cast(tf.one_hot(y,depth=4),dtype=tf.float32)
In [17]: y.shape
Out[17]: TensorShape([5, 4])
In [18]: y_pred = tf.random.normal([5,4])
In [19]: loss1 = tf.reduce_mean(tf.square(y-y_pred))
In [20]: loss2 = tf.square(tf.norm(y-y_pred))/(5*4)
In [21]: loss3 = tf.reduce_mean(tf.losses.MSE(y,y_pred))
In [22]: loss1, loss2, loss3
Out[22]:
All three formulations give the same scalar: averaging the squared error over all 20 elements, squaring the norm and dividing by the number of elements (5*4), and averaging tf.losses.MSE over the batch are equivalent ways to compute the MSE.
In [23]: tf.losses.categorical_crossentropy(y,y_pred)
Out[23]:
In [24]: tf.reduce_mean(tf.losses.categorical_crossentropy(y,y_pred))
Out[24]:
In [25]: x = tf.random.normal([1,784])
In [26]: w = tf.random.normal([784,2])
In [27]: b = tf.zeros([2])
In [28]: logits = x@w + b
In [29]: logits
Out[29]:
In [30]: prob = tf.nn.softmax(logits,axis=1)
In [31]: prob
Out[31]:
In [32]: tf.losses.categorical_crossentropy([0,1], logits, from_logits=True)
Out[32]:
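Passing raw logits with from_logits=True is the numerically preferred route: the softmax is applied inside the loss in a stable way. A small sketch showing that it matches applying the softmax yourself first, up to floating-point error:

# Two equivalent ways to get the cross-entropy for the example above:
ce_from_logits = tf.losses.categorical_crossentropy([0., 1.], logits, from_logits=True)
ce_from_probs  = tf.losses.categorical_crossentropy([0., 1.], tf.nn.softmax(logits))
print(ce_from_logits.numpy(), ce_from_probs.numpy())      # nearly identical values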
4. Gradients
GradientTape: computing gradients here is different from TensorFlow 1.x; in TensorFlow 2.x you compute gradients of the relevant parameters with a GradientTape. By default a tape can only be used once; if you need to call gradient() more than once, create the tape with persistent=True (see the sketch after the example below).
In [33]: with tf.GradientTape() as tape:
...: tape.watch([w,b])
...: logits = x@w + b
In [34]: grad1 = tape.gradient(logits, [w,b])
In [35]: grad1
Out[35]:
[<gradient of logits w.r.t. w, shape (784, 2)>, <gradient of logits w.r.t. b, shape (2,)>]
In [36]: grad2 = tape.gradient(logits, [w,b])
RuntimeError: GradientTape.gradient can only be called once on non-persistent tapes.
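A minimal sketch of the persistent variant, which allows repeated gradient() calls:

# persistent=True keeps the tape's resources alive so gradient() can be called repeatedly.
with tf.GradientTape(persistent=True) as tape:
    tape.watch([w, b])
    logits = x @ w + b
grad_w = tape.gradient(logits, w)
grad_b = tape.gradient(logits, b)   # a second call is fine on a persistent tape
del tape                            # release the tape when done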
Once we know how to take gradients, we can compute the gradient of a loss function: the forward pass computes the loss, and the backward pass computes the gradients of the loss with respect to the parameters.
MSE:
In [36]: x = tf.random.normal([2,4])
In [37]: w = tf.random.normal([4,3])
In [38]: b= tf.zeros([3])
In [39]: y = tf.constant([2,0])
In [40]: with tf.GradientTape() as tape:
...: tape.watch([w,b])
...: prob = tf.nn.softmax(x@w, axis=1)
...: loss = tf.reduce_mean(tf.losses.MSE(tf.one_hot(y,depth=3), prob))
In [41]: grads = tape.gradient(loss,[w,b])
In [42]: grads
Out[42]:
[<gradient of loss w.r.t. w, shape (4, 3)>, None]
The gradient for b is None because b is watched but never used in the forward pass above (the bias is not added to x@w), so the loss does not depend on it.
Cross-entropy:
In [43]: with tf.GradientTape() as tape:
...: tape.watch([w,b])
...: logits = x@w+b
...: loss = tf.reduce_mean(tf.losses.categorical_crossentropy(tf.one_hot(y, depth=3), logits, from_logits=True))
In [44]: grads = tape.gradient(loss,[w,b])
In [45]: grads
Out[45]:
[<gradient of loss w.r.t. w, shape (4, 3)>, <gradient of loss w.r.t. b, shape (3,)>]
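Once the gradients are available, a plain gradient-descent update can be applied. A minimal sketch, assuming w and b are wrapped in tf.Variable (so they can be updated in place, and tape.watch is no longer needed) and an illustrative learning rate:

w = tf.Variable(tf.random.normal([4, 3]))
b = tf.Variable(tf.zeros([3]))
lr = 1e-2                                 # illustrative learning rate

with tf.GradientTape() as tape:           # variables are watched automatically
    logits = x @ w + b
    loss = tf.reduce_mean(tf.losses.categorical_crossentropy(
        tf.one_hot(y, depth=3), logits, from_logits=True))
grads = tape.gradient(loss, [w, b])
w.assign_sub(lr * grads[0])               # w <- w - lr * dL/dw
b.assign_sub(lr * grads[1])               # b <- b - lr * dL/db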