Fully Connected Layers Explained in One Article

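The fully connected layer is an important building block of deep learning models and plays a key role in convolutional neural networks in particular. With well-chosen input and output dimensions, combined with techniques such as activation functions and Dropout, fully connected layers can handle a wide range of tasks, including classification, regression, and generation. Understanding how to use them is essential for building effective deep learning models.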

TuringEmmy · Published 2025-04-15 12:20:00

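A fully connected layer (often called a linear or dense layer; `Linear` in PyTorch) is one of the most important layer types in deep learning models. It is typically used to map a feature vector, often obtained by flattening multi-dimensional features, to an output vector of a chosen size, which is how classification and regression outputs are produced. In a convolutional neural network (CNN), the fully connected layers usually sit at the end of the network, where they combine the extracted features and output the prediction.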

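1. How a Fully Connected Layer Works

The core idea of a fully connected layer is that every neuron in the layer is connected to every neuron in the previous layer. By combining a linear (affine) transformation with an activation function, the layer applies a nonlinear mapping to its input, which lets the network learn complex patterns.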

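The mathematical expression is:
$y = f(W x + b)$
where:

$x$ is the input vector.
$W$ is the weight matrix, of shape (out_features, in_features).
$b$ is the bias vector, of shape (out_features,).
$f$ is the activation function (such as ReLU or Sigmoid).
$y$ is the output vector.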
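
To make the formula concrete, the following is a minimal sketch that evaluates $y = f(W x + b)$ by hand and compares it with the result from a PyTorch layer. The sizes (3 inputs, 2 outputs), the fixed seed, and the choice of ReLU for $f$ are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random values are reproducible

layer = nn.Linear(in_features=3, out_features=2)
x = torch.randn(3)  # a single input vector

with torch.no_grad():
    W = layer.weight  # shape (out_features, in_features) = (2, 3)
    b = layer.bias    # shape (out_features,) = (2,)

    # y = f(W x + b) written out by hand, with f = ReLU
    manual = torch.relu(W @ x + b)

    # The same computation through the layer plus a separate activation
    builtin = torch.relu(layer(x))

print(W.shape, b.shape)
print(torch.allclose(manual, builtin, atol=1e-6))
```

Note that the layer itself only computes the affine part $W x + b$; the activation is applied as a separate step.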

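2. Parameters of `torch.nn.Linear`

In PyTorch, the affine part of a fully connected layer, $W x + b$, is implemented by torch.nn.Linear; the activation function is applied separately. Its definition is as follows: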

torch.nn.Linear(in_features, out_features, bias=True)

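2.1. Parameter details:

2.1.1. `in_features`:

The number of input features, that is, the dimension of the input vector.

2.1.2. `out_features`:

The number of output features, that is, the dimension of the output vector.

2.1.3. `bias` (optional, defaults to `True`):

Whether to include a bias term. If set to False, the layer does not learn a bias vector.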
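
As a quick check of how these parameters determine what the layer learns, the sketch below builds one layer with a bias and one without, then counts their learnable values. The helper `count_params` is a hypothetical name introduced here for illustration, not part of PyTorch.

```python
import torch.nn as nn

with_bias = nn.Linear(in_features=128, out_features=64, bias=True)
no_bias = nn.Linear(in_features=128, out_features=64, bias=False)


def count_params(module):
    """Total number of learnable scalars in a module (illustrative helper)."""
    return sum(p.numel() for p in module.parameters())


print(with_bias.weight.shape)   # torch.Size([64, 128])
print(no_bias.bias)             # None
print(count_params(with_bias))  # 128 * 64 + 64 = 8256
print(count_params(no_bias))    # 128 * 64 = 8192
```

The weight is stored with shape (out_features, in_features), so the parameter count grows with the product of the two dimensions; the bias adds only out_features more.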

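2.2. Input and output

2.2.1. Input:

Input shape: (batch_size, in_features)

batch_size: the number of samples in the batch.
in_features: the number of input features per sample.

The layer acts on the last dimension only, so inputs with extra leading dimensions, of shape (*, in_features), are also accepted.

2.2.2. Output:

Output shape: (batch_size, out_features)

out_features: the number of output features per sample.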
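
The shape rules can be checked directly. The sketch below passes a 2-D batch and a 3-D batch through the same layer, then shows what happens when the last dimension does not match `in_features`. The sizes and the sequence-length interpretation of the middle dimension are assumptions for illustration only.

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=128, out_features=64)

batch_2d = torch.randn(4, 128)      # (batch_size, in_features)
batch_3d = torch.randn(4, 10, 128)  # e.g. (batch_size, seq_len, in_features)

out_2d = layer(batch_2d)
out_3d = layer(batch_3d)  # only the last dimension is transformed
print(out_2d.shape)  # torch.Size([4, 64])
print(out_3d.shape)  # torch.Size([4, 10, 64])

# A last dimension other than in_features is rejected at call time
try:
    layer(torch.randn(4, 100))
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
print(mismatch_raised)
```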

3. Usage Examples

Below are some typical scenarios for fully connected layers, with code examples.

3.1. Example 1: Basic usage

import torch
import torch.nn as nn

# Define a Linear layer
fc_layer = nn.Linear(in_features=128, out_features=64)

# Create a random input tensor
input_tensor = torch.randn(4, 128)  # batch_size=4, in_features=128

# Run the forward pass
output_tensor = fc_layer(input_tensor)
print(output_tensor.shape)  # Output shape: torch.Size([4, 64])

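3.2. Example 2: Adding an activation function

Fully connected layers are usually combined with an activation function to introduce nonlinearity.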

import torch
import torch.nn as nn

# Define a Linear layer
fc_layer = nn.Linear(in_features=128, out_features=64)

# Add a ReLU activation function
activation = nn.ReLU()

# Create a random input tensor
input_tensor = torch.randn(4, 128)  # batch_size=4, in_features=128

# Run the forward pass
output_tensor = activation(fc_layer(input_tensor))
print(output_tensor.shape)  # Output shape: torch.Size([4, 64])
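
One way to see why the activation matters: without it, two stacked linear layers collapse into a single affine map, since $W_2 (W_1 x + b_1) + b_2 = (W_2 W_1) x + (W_2 b_1 + b_2)$. The sketch below checks this numerically; the layer sizes, seed, and tolerance are assumptions chosen only to keep the example small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

fc1 = nn.Linear(8, 6)
fc2 = nn.Linear(6, 4)
x = torch.randn(5, 8)  # batch of 5 samples

with torch.no_grad():
    # Two linear layers with no activation in between
    stacked = fc2(fc1(x))

    # Fold both layers into one affine map: W = W2 W1, b = W2 b1 + b2
    W = fc2.weight @ fc1.weight           # shape (4, 8)
    b = fc2.weight @ fc1.bias + fc2.bias  # shape (4,)
    merged = x @ W.T + b

    # With ReLU in between, the result is generally no longer this affine map
    with_relu = fc2(torch.relu(fc1(x)))

print(torch.allclose(stacked, merged, atol=1e-5))
print(torch.allclose(stacked, with_relu, atol=1e-5))
```

The first comparison should agree up to floating-point error, which is why depth without activation functions adds no expressive power.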

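3.3. Example 3: Using fully connected layers in a CNN

In a convolutional neural network, the fully connected layers usually come after the convolutional and pooling layers, where they combine the features and produce the final output.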

import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolution + pooling layers
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

        # Fully connected layers
        self.fc1 = nn.Linear(16 * 14 * 14, 128)  # assumes 28x28 input images (14x14 after pooling)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)  # 10 output classes

    def forward(self, x):
        # Convolution + pooling
        x = self.pool(torch.relu(self.conv1(x)))

        # Flatten everything except the batch dimension
        x = torch.flatten(x, 1)

        # Fully connected layers
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the model
model = SimpleCNN()

# Create a random input tensor
input_tensor = torch.randn(4, 1, 28, 28)  # batch_size=4, channels=1, height=28, width=28

# Run the forward pass
output_tensor = model(input_tensor)
print(output_tensor.shape)  # Output shape: torch.Size([4, 10])
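
The value 16 * 14 * 14 passed to the first fully connected layer follows from the usual output-size formula for convolution and pooling, floor((size + 2 * padding - kernel_size) / stride) + 1, assuming dilation 1. The sketch below works through that arithmetic in plain Python; `conv2d_out_size` is a hypothetical helper written for this example, not a PyTorch function.

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0):
    """Output height/width of a conv or pooling layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel_size) // stride + 1


side = 28                                                         # input height and width
side = conv2d_out_size(side, kernel_size=3, stride=1, padding=1)  # after conv1: 28
side = conv2d_out_size(side, kernel_size=2, stride=2)             # after pooling: 14

out_channels = 16
flat_features = out_channels * side * side
print(side, flat_features)  # 14 3136
```

If the input image size changes, the same calculation gives the new `in_features` for the first fully connected layer.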