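Key points:
Standard way to write training and testing for color and grayscale images: wrap the logic in functions
Flatten operation: flatten every dimension except the first one (the batch size)
Dropout: neurons are randomly dropped during training; eval mode turns dropout off at test time
Exercise: study the logic of the training and test code carefully. It is the foundation, this code skeleton will be reused from here on, and the focus will gradually shift to the model-definition stage.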
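In deep learning projects, a well-organized code structure greatly improves development efficiency and maintainability. This post uses PyTorch to walk through a standard way of writing training and test code for image data, from single-channel images to color images, so that you end up with a clear, reusable training workflow.
We use the MNIST handwritten-digit dataset as the example; its images are single-channel grayscale. Data preprocessing is the starting point of model training, and we use torchvision.transforms to convert the images: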
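from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),  # convert to a tensor and scale pixel values to [0, 1]
    transforms.Normalize((0.1307,), (0.3081,))  # standardize with the MNIST mean and standard deviation
])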
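To make the two preprocessing steps concrete, the sketch below reproduces the arithmetic that Normalize applies to each channel, (x - mean) / std, using plain tensor operations. The `normalize` helper is a hypothetical stand-in written for illustration, not part of torchvision.

```python
import torch

MNIST_MEAN, MNIST_STD = 0.1307, 0.3081  # the values passed to Normalize above

def normalize(x, mean, std):
    """Hypothetical helper: the per-channel arithmetic of Normalize."""
    return (x - mean) / std

# ToTensor yields pixel values in [0, 1]; take black, mid-gray, and white pixels.
pixels = torch.tensor([0.0, 0.5, 1.0])
normalized = normalize(pixels, MNIST_MEAN, MNIST_STD)
print(normalized)  # roughly -0.42, 1.20, 2.82
```

After standardization the inputs are no longer confined to [0, 1]; black pixels become negative and white pixels land near 2.8.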
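Next, load the datasets and create the data loaders:
from torch.utils.data import DataLoader
from torchvision import datasets

# download=True fetches the MNIST files into ./data on the first run
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, transform=transform)
batch_size = 64
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)   # reshuffle every epoch
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)    # fixed order for evaluation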
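The batching behavior can be checked without downloading anything. The sketch below uses a TensorDataset of random tensors as a hypothetical stand-in for MNIST (150 samples is an arbitrary choice) and shows how many batches a DataLoader produces and how large each one is.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 150 random 1x28x28 "images" with labels 0-9.
images = torch.randn(150, 1, 28, 28)
labels = torch.randint(0, 10, (150,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=64, shuffle=True)

# len(loader) is the number of batches, ceil(150 / 64) = 3;
# the last batch holds the remaining 22 samples because drop_last defaults to False.
batch_sizes = [data.size(0) for data, target in loader]
print(len(loader), batch_sizes)  # 3 [64, 64, 22]
```

By the same rule, the 60,000 MNIST training images with batch_size=64 give ceil(60000 / 64) = 938 batches per epoch, which is the value len(train_loader) returns.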
For the 28×28 MNIST images, define a multilayer perceptron (MLP):
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()  # flatten each 28x28 image into a 784-dimensional vector
        self.layer1 = nn.Linear(784, 128)  # hidden layer
        self.relu = nn.ReLU()
        self.layer2 = nn.Linear(128, 10)  # one output per digit class

    def forward(self, x):
        x = self.flatten(x)
        x = self.layer1(x)
        x = self.relu(x)
        x = self.layer2(x)  # raw logits; no softmax here
        return x
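As a quick check of the flatten step, the sketch below passes batches of zeros through nn.Flatten and prints the resulting shapes. The nn.Sequential model is a stand-in with the same layers as the MLP class, used only so the block runs on its own.

```python
import torch
import torch.nn as nn

flatten = nn.Flatten()  # default start_dim=1 keeps dimension 0 (the batch) intact

gray_batch = torch.zeros(64, 1, 28, 28)   # MNIST-shaped batch
color_batch = torch.zeros(64, 3, 32, 32)  # CIFAR-10-shaped batch

print(flatten(gray_batch).shape)   # torch.Size([64, 784])
print(flatten(color_batch).shape)  # torch.Size([64, 3072])

# Stand-in with the same layers as the MLP class, to count its parameters.
mlp = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
n_params = sum(p.numel() for p in mlp.parameters())
print(n_params)  # 784*128 + 128 + 128*10 + 10 = 101770
```

The batch dimension survives flattening, which is why the first Linear layer only needs to know the per-sample size (784), not the batch size.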
To make the code easier to reuse and read, we wrap the training and test logic in functions:
import torch

def train(model, train_loader, test_loader, criterion, optimizer, device, epochs):
    """Train for `epochs` epochs, evaluate after each one, and return the final test accuracy."""
    all_iter_losses = []  # loss of every iteration (batch)
    iter_indices = []     # global iteration number of every batch
    for epoch in range(epochs):
        # test() leaves the model in eval mode, so switch back to training mode every epoch
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()             # clear gradients from the previous step
            output = model(data)              # forward pass
            loss = criterion(output, target)
            loss.backward()                   # backpropagation
            optimizer.step()                  # update the parameters

            # record the loss of this iteration
            iter_loss = loss.item()
            all_iter_losses.append(iter_loss)
            iter_indices.append(epoch * len(train_loader) + batch_idx + 1)

            # accumulate loss and accuracy statistics
            running_loss += iter_loss
            _, predicted = output.max(1)      # index of the largest logit = predicted class
            total += target.size(0)
            correct += predicted.eq(target).sum().item()

            if (batch_idx + 1) % 100 == 0:
                print(f'Epoch: {epoch+1}/{epochs} | Batch: {batch_idx+1}/{len(train_loader)} '
                      f'| Batch loss: {iter_loss:.4f} | Running average loss: {running_loss/(batch_idx+1):.4f}')

        epoch_train_loss = running_loss / len(train_loader)
        epoch_train_acc = 100. * correct / total
        epoch_test_loss, epoch_test_acc = test(model, test_loader, criterion, device)
        print(f'Epoch {epoch+1}/{epochs} done | Train accuracy: {epoch_train_acc:.2f}% | Test accuracy: {epoch_test_acc:.2f}%')

    # plot the per-iteration losses with the plot_iter_losses helper
    plot_iter_losses(all_iter_losses, iter_indices)
    return epoch_test_acc

def test(model, test_loader, criterion, device):
    """Evaluate the model and return (average loss per batch, accuracy in percent)."""
    model.eval()  # eval mode: dropout is disabled
    test_loss = 0
    correct = 0
    total = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()
            _, predicted = output.max(1)
            total += target.size(0)
            correct += predicted.eq(target).sum().item()
    avg_loss = test_loss / len(test_loader)
    accuracy = 100. * correct / total
    return avg_loss, accuracy
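train() calls plot_iter_losses, which this post does not define. Below is a minimal sketch of such a helper, assuming matplotlib is installed and that a single line plot of loss against iteration number is what is wanted. The figure size, labels, and the returned figure object are illustrative choices, not the original author's implementation.

```python
import matplotlib.pyplot as plt

def plot_iter_losses(losses, indices):
    """Plot the per-iteration training loss recorded by train()."""
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(indices, losses, alpha=0.7, label="Iteration loss")
    ax.set_xlabel("Iteration (batch number)")
    ax.set_ylabel("Loss")
    ax.set_title("Training loss per iteration")
    ax.legend()
    ax.grid(True)
    fig.tight_layout()
    plt.show()
    return fig  # returned so callers can also save or inspect the figure
```

Per-iteration curves are noisy because each point is the loss of one batch; the running average printed during training gives a smoother view of the same trend.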
Set the number of epochs and start training:
epochs = 2
print("Starting training...")
final_accuracy = train(model, train_loader, test_loader, criterion, optimizer, device, epochs)
print(f"Training finished! Final test accuracy: {final_accuracy:.2f}%")
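The launch code relies on model, criterion, optimizer, and device, none of which are defined in this post; they have to exist before train() is called. The sketch below shows one plausible setup. The choice of cross-entropy loss, the Adam optimizer, and the learning rate of 0.001 are assumptions, as is the nn.Sequential stand-in, which mirrors the MLP layers so the block runs on its own (in the post's own code this would be `model = MLP().to(device)`).

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Assumption: use a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in with the same layers as the MLP class, so this block is self-contained.
model = nn.Sequential(
    nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10)
).to(device)

# Assumptions: cross-entropy loss (it expects raw logits) and Adam with lr=0.001.
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Sanity check: one optimization step on a random MNIST-shaped batch.
data = torch.randn(64, 1, 28, 28, device=device)
target = torch.randint(0, 10, (64,), device=device)
optimizer.zero_grad()
loss = criterion(model(data), target)
loss.backward()
optimizer.step()
print(f"loss on a random batch: {loss.item():.4f}")
```

Moving the model to the device once and each batch to the same device inside the loop keeps tensors and parameters on matching hardware, which is what the `.to(device)` calls in train() and test() assume.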
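For color images (such as the CIFAR-10 dataset), the workflow is much the same as for single-channel images. The main differences are the per-channel normalization in preprocessing and the input dimension of the model. The preprocessing becomes:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # one mean and one std per color channel
])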
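With a mean and standard deviation of 0.5 for every channel, the normalization maps pixel values from [0, 1] to [-1, 1]. The sketch below checks this on a random image using plain tensor broadcasting; the random image is a hypothetical stand-in for a CIFAR-10 sample. Note that 0.5 is a generic choice here rather than a measured dataset statistic, unlike the MNIST values used earlier.

```python
import torch

# One value per channel, reshaped to (3, 1, 1) so it broadcasts over height and width.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

image = torch.rand(3, 32, 32)  # random values in [0, 1), shaped like ToTensor output
normalized = (image - mean) / std

in_range = normalized.min().item() >= -1.0 and normalized.max().item() <= 1.0
print(normalized.shape, in_range)
```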
The model now takes 3×32×32 inputs, so the first layer has 3072 input features, and a dropout layer follows each hidden layer:
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()  # flatten each 3x32x32 image into a 3072-dimensional vector
        self.layer1 = nn.Linear(3072, 512)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout(0.2)  # drop 20% of activations during training
        self.layer2 = nn.Linear(512, 256)
        self.relu2 = nn.ReLU()
        self.dropout2 = nn.Dropout(0.2)
        self.layer3 = nn.Linear(256, 10)  # one output per CIFAR-10 class

    def forward(self, x):
        x = self.flatten(x)
        x = self.layer1(x)
        x = self.relu1(x)
        x = self.dropout1(x)
        x = self.layer2(x)
        x = self.relu2(x)
        x = self.dropout2(x)
        x = self.layer3(x)
        return x
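The difference between training mode and eval mode is easy to observe on a dropout layer in isolation. The sketch below applies nn.Dropout(0.2) to a tensor of ones in both modes; the input size of 1000 and the seed are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make the run repeatable
dropout = nn.Dropout(p=0.2)
x = torch.ones(1, 1000)

# Training mode: about 20% of the elements are zeroed, and the survivors
# are scaled by 1 / (1 - p) = 1.25 so the expected value stays the same.
dropout.train()
train_out = dropout(x)
kept = train_out[train_out != 0]
dropped_fraction = (train_out == 0).float().mean().item()
print(kept.unique(), dropped_fraction)

# Eval mode: dropout is the identity function.
dropout.eval()
eval_out = dropout(x)
print(torch.equal(eval_out, x))  # True

# model.train() / model.eval() propagate the mode to every submodule.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(0.2))
model.eval()
print(model[1].training)  # False
model.train()
print(model[1].training)  # True
```

Because the mode is a persistent flag on the module, any function that calls model.eval() leaves the model in eval mode until model.train() is called again. A training loop that evaluates between epochs therefore needs to switch back to training mode before the next epoch, or dropout stays disabled.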
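DataLoader and Dataset handle the data in batches, which makes data loading more efficient. By following the conventions above, you can train and test models on both single-channel and color image data, and extend or tune the code as a project requires.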
@浙大疏锦行