Hey folks! In the mysterious world of deep learning, we're all a bunch of "alchemists": the model is the "elixir" we carefully refine, and the dataset is the raw material for the craft. Today we'll use the classic LeNet convolutional neural network on the MNIST handwritten digit dataset, that treasure trove of raw material, to refine a magic elixir that recognizes digits with precision!
For a refresher on the LeNet architecture, see: A Complete Breakdown of Six Classic Network Architectures for Deep Learning Image Classification
MNIST is the veteran star ingredient of the deep learning world: it contains 60,000 training images and 10,000 test images, each a 28x28-pixel handwritten digit covering everything from 0 to 9. These images are like a crowd of mischievous "digit sprites", waiting for us to refine them into a useful "digit elixir" in our LeNet furnace.
Figure: sample handwritten digits from the MNIST dataset
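If you'd like to eyeball the raw ingredients before firing up the furnace, here's a minimal sketch (assuming torchvision is installed; ./data matches the directory used in the full code below):

import torchvision

# Load the raw training split (no transform, so each sample stays a PIL image)
mnist = torchvision.datasets.MNIST(root='./data', train=True, download=True)
print(len(mnist))          # 60000 training samples
image, label = mnist[0]    # one sample: a PIL image and its integer label
print(image.size, label)   # (28, 28) and a digit from 0 to 9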
LeNet is one of the founding ancestors of convolutional neural networks: simple in structure yet remarkably capable, like a carefully designed alchemy furnace that turns the incoming "digit sprites" into accurate predictions. It consists mainly of convolutional layers, pooling layers, and fully connected layers. Let's take a closer look at how this furnace is built.
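To make the furnace's inner workings concrete, here's a minimal sketch (the layer parameters match the implementation below) tracing how a single 28x28 input changes shape layer by layer:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)           # one grayscale 28x28 "digit sprite"
x = nn.Conv2d(1, 6, 5)(x)               # C1: 5x5 conv -> (1, 6, 24, 24)
x = nn.MaxPool2d(2, 2)(torch.relu(x))   # S2: 2x2 pool -> (1, 6, 12, 12)
x = nn.Conv2d(6, 16, 5)(x)              # C3: 5x5 conv -> (1, 16, 8, 8)
x = nn.MaxPool2d(2, 2)(torch.relu(x))   # S4: 2x2 pool -> (1, 16, 4, 4)
x = x.view(-1, 16 * 4 * 4)              # flatten      -> (1, 256)
print(x.shape)                          # torch.Size([1, 256])

This trace is also where the 16 * 4 * 4 input size of the first fully connected layer comes from.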
Below is the complete PyTorch code for implementing the LeNet model and "refining" it on the MNIST dataset!
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
# Define the LeNet model
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)        # convolutional layer C1
        self.pool = nn.MaxPool2d(2, 2)         # pooling layers S2 and S4 (shared)
        self.conv2 = nn.Conv2d(6, 16, 5)       # convolutional layer C3
        self.fc1 = nn.Linear(16 * 4 * 4, 120)  # fully connected layer C5
        self.fc2 = nn.Linear(120, 84)          # fully connected layer F6
        self.fc3 = nn.Linear(84, 10)           # output layer

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # C1 and S2
        x = self.pool(torch.relu(self.conv2(x)))  # C3 and S4
        x = x.view(-1, 16 * 4 * 4)                # flatten
        x = torch.relu(self.fc1(x))               # C5
        x = torch.relu(self.fc2(x))               # F6
        x = self.fc3(x)                            # output layer
        return x
# Data preprocessing and loading
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))  # MNIST mean and std
])
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)
# Initialize the model, loss function, and optimizer
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = LeNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Train the model
num_epochs = 10
train_losses = []
train_accuracies = []
for epoch in range(num_epochs):
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    epoch_loss = running_loss / len(train_loader)
    epoch_accuracy = 100 * correct / total
    train_losses.append(epoch_loss)
    train_accuracies.append(epoch_accuracy)
    print(f'Epoch {epoch + 1}/{num_epochs}, Loss: {epoch_loss:.4f}, Accuracy: {epoch_accuracy:.2f}%')
# Plot the training loss and accuracy curves
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(train_losses, label='Training Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Curve')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(train_accuracies, label='Training Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy (%)')
plt.title('Training Accuracy Curve')
plt.legend()
plt.tight_layout()
plt.show()
# Evaluate the model on the test set
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
test_accuracy = 100 * correct / total
print(f'Test Accuracy: {test_accuracy:.2f}%')
Code explanation: the LeNet class implements the model structure we analyzed above, and torchvision handles loading the MNIST dataset and preprocessing it. Run the code above and you can watch the whole "refining" process and its results unfold! During training you'll see output like this:
Epoch 1/10, Loss: 0.2876, Accuracy: 91.52%
Epoch 2/10, Loss: 0.0752, Accuracy: 97.68%
...
Epoch 10/10, Loss: 0.0234, Accuracy: 99.21%
Test Accuracy: 99.17%
As the results show, after 10 epochs of training the model reaches 99.21% accuracy on the training set and 99.17% on the test set. The refining was a resounding success: LeNet has turned MNIST's "digit sprites" into a magic elixir that recognizes digits with precision!
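If you want to keep this freshly refined elixir around for later, here's a minimal sketch of saving and reloading the trained weights (the file name lenet_mnist.pth is just an example):

# Save only the learned parameters (the recommended PyTorch pattern)
torch.save(model.state_dict(), 'lenet_mnist.pth')

# Later: rebuild the furnace and pour the elixir back in
model = LeNet().to(device)
model.load_state_dict(torch.load('lenet_mnist.pth', map_location=device))
model.eval()  # switch to evaluation mode before inference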
Folks, wasn't today's refining session with LeNet on MNIST great fun? Through this hands-on exercise we not only mastered the structure and implementation of LeNet, but also learned how to train and evaluate a model with PyTorch. Deep learning is a marvelous journey of alchemy, and every attempt can bring unexpected rewards. I hope you'll keep exploring this world of challenges and surprises and refine ever more powerful elixirs of your own!
That's it for today's refining session. See you next time!