Building submodules means picking the layers you want from torch.nn, then stitching them together in the forward pass (the forward() function).
With PyTorch, assembling a neural network model is like stacking building blocks. Every network layer (whether user-defined or built into PyTorch) inherits from the nn.Module class; this standardizes layers and makes them easier to manage.
Here is the relevant source from the official codebase:
```python
class Module(object):
    r"""Base class for all neural network modules.

    Your models should also subclass this class.

    Modules can also contain other Modules, allowing to nest them in
    a tree structure.
    """

    dump_patches = False
    _version = 1

    def __init__(self):
        """
        Initializes internal Module state, shared by both nn.Module and ScriptModule.
        """
        torch._C._log_api_usage_once("python.nn_module")

        self.training = True
        self._parameters = OrderedDict()
        self._buffers = OrderedDict()
        self._backward_hooks = OrderedDict()
        self._forward_hooks = OrderedDict()
        self._forward_pre_hooks = OrderedDict()
        self._state_dict_hooks = OrderedDict()
        self._load_state_dict_pre_hooks = OrderedDict()
        self._modules = OrderedDict()

    def forward(self, *input):
        r"""Defines the computation performed at every call.

        Should be overridden by all subclasses.

        .. note::
            Although the recipe for forward pass needs to be defined within
            this function, one should call the :class:`Module` instance afterwards
            instead of this since the former takes care of running the
            registered hooks while the latter silently ignores them.
        """
        raise NotImplementedError
```
As you can see, nn.Module maintains 8 attributes, each an OrderedDict, to manage its state (parameters, buffers, hooks, and submodules).
A concrete construction example:
```python
import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    # build the submodules here
    def __init__(self, classes):
        # initialize the 8 attributes of nn.Module
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out
```
When a LeNet instance is constructed, the parent class nn.Module's 8 attributes are initialized first. After that, every assignment such as self.conv1 = nn.Conv2d(3, 6, 5) is intercepted by nn.Module's __setattr__, which checks the value's type: a Parameter is added to self._parameters, and a Module is added to self._modules.
These points can be verified by stepping through the code with a debugger.
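Instead of (or in addition to) the debugger, a quick sanity check in code (a sketch, assuming the LeNet class defined above) shows where the layers ended up:

```python
net = LeNet(classes=2)

# the nn.Module assignments were intercepted by __setattr__
# and registered in the _modules ordered dictionary
print(net._modules.keys())
# odict_keys(['conv1', 'conv2', 'fc1', 'fc2', 'fc3'])

# each submodule's weights are reachable through the module tree
for name, param in net.named_parameters():
    print(name, tuple(param.shape))
```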
Summary of nn.Module: modules can be nested in a tree structure; each module maintains 8 OrderedDicts to manage its parameters, buffers, hooks, and submodules; and attribute assignment is intercepted so that parameters and submodules are registered automatically.
Reference: https://blog.csdn.net/oldmao_2001/article/details/102787546
Note: "network layer" below is short for "neural network layer".
nn.Sequential is a container derived from nn.Module that wraps a group of layers and calls them in order. Taking LeNet as an example, the network can be wrapped with 2 Sequential blocks: one for the feature extractor and one for the classifier. You could also use a single Sequential, but that is less readable. The code is as follows:
```python
class LeNetSequential(nn.Module):
    def __init__(self, classes):
        super(LeNetSequential, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(16*5*5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x
```
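A quick smoke test (a sketch assuming a 32x32 RGB input, the classic LeNet input size). Note that layers inside a plain Sequential are addressed by integer index:

```python
import torch

net = LeNetSequential(classes=2)
fake_img = torch.randn((4, 3, 32, 32))  # batch of 4 RGB 32x32 images
output = net(fake_img)
print(output.shape)     # torch.Size([4, 2])
print(net.features[0])  # Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
```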
In the definition above, each layer is named by its index, which is inconvenient for debugging and reading once the network has many layers. You can pass an OrderedDict when creating the Sequential to give each layer a name:
```python
# Variant 2 (more readable)
from collections import OrderedDict


class LeNetSequentialOrderDict(nn.Module):
    def __init__(self, classes):
        super(LeNetSequentialOrderDict, self).__init__()
        # note: passing a plain dict literal to OrderedDict preserves
        # insertion order only on Python 3.7+; on older versions use a
        # list of (name, module) tuples instead
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 6, 5),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=2, stride=2),
            'conv2': nn.Conv2d(6, 16, 5),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=2, stride=2),
        }))
        self.classifier = nn.Sequential(OrderedDict({
            'fc1': nn.Linear(16*5*5, 120),
            'relu3': nn.ReLU(),
            'fc2': nn.Linear(120, 84),
            'relu4': nn.ReLU(inplace=True),
            'fc3': nn.Linear(84, classes),
        }))

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x
```
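With named layers, a submodule can be fetched by name as well as by index, which makes large models easier to inspect:

```python
net = LeNetSequentialOrderDict(classes=2)
print(net.features.conv1)  # look up the layer by name
print(net.features[0])     # integer indexing still works
```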
Summary: nn.Sequential runs the wrapped layers strictly in the order they were added, and its sub-layers can be addressed by integer index or, when built from an OrderedDict, by name.
nn.ModuleList is a container derived from nn.Module that wraps a group of layers so they can be called iteratively, like a Python list. Its main methods mirror the list API: append() adds a layer at the end, extend() concatenates two ModuleLists, and insert() adds a layer at a given position.
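A minimal sketch of these three methods (the layer choices here are arbitrary, just to show the calls):

```python
layers = nn.ModuleList([nn.Linear(10, 10)])
layers.append(nn.ReLU())                          # add one layer at the end
layers.extend(nn.ModuleList([nn.Linear(10, 5)]))  # concatenate another ModuleList
layers.insert(1, nn.BatchNorm1d(10))              # insert at position 1
print(len(layers))  # 4
```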
The following example builds a 20-layer fully connected (FC) network:
```python
class ModuleList(nn.Module):
    def __init__(self):
        super(ModuleList, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for _ in range(20)])

    def forward(self, x):
        # call the wrapped layers one by one, in list order
        for linear in self.linears:
            x = linear(x)
        return x
```
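Running it (a sketch, assuming torch is imported):

```python
net = ModuleList()
fake_data = torch.ones((10, 10))
output = net(fake_data)
print(output.shape)  # torch.Size([10, 10])
```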
nn.ModuleDict is a container derived from nn.Module that wraps a group of layers so they can be called by name, like a Python dict. Its main methods mirror the dict API: clear(), items(), keys(), values(), and pop().
```python
import torch


class ModuleDict(nn.Module):
    def __init__(self):
        super(ModuleDict, self).__init__()
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d(3)
        })
        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    def forward(self, x, choice, act):
        # select the layer and activation to run by name
        x = self.choices[choice](x)
        x = self.activations[act](x)
        return x


net = ModuleDict()
fake_img = torch.randn((4, 10, 32, 32))
output = net(fake_img, 'conv', 'relu')
print(output)
```
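Because nn.ModuleDict mirrors the dict API, the methods listed above work directly on the container:

```python
print(net.choices.keys())       # odict_keys(['conv', 'pool'])
pool = net.choices.pop('pool')  # remove and return a layer by name
print('pool' in net.choices)    # False
```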