A Simple Explanation of nn.Conv1d

1. Definition in the official documentation

In the simplest case, the output value of the layer with input size $(N, C_{in}, L)$ and output $(N, C_{out}, L_{out})$ can be precisely described as:

$$
\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{in} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)
$$

where $\star$ is the valid cross-correlation operator, $N$ is the batch size, $C$ denotes the number of channels, and $L$ is the length of the signal sequence.
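To make the formula concrete, here is a minimal sketch (not PyTorch's actual implementation) that evaluates the sum with explicit loops for the simplest case (stride 1, no padding, dilation 1, groups 1) and compares the result with nn.Conv1d. The helper name manual_conv1d and the tensor sizes are chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

conv = nn.Conv1d(in_channels=2, out_channels=3, kernel_size=3)  # stride=1, padding=0
x = torch.rand(1, 2, 5)  # (N, C_in, L)


def manual_conv1d(x, weight, bias):
    """Hypothetical helper: valid cross-correlation with stride 1 and no padding."""
    n, c_in, length = x.shape
    c_out, _, k = weight.shape  # weight has shape (C_out, C_in, kernel_size)
    l_out = length - k + 1
    out = torch.zeros(n, c_out, l_out)
    for i in range(n):              # sample N_i
        for j in range(c_out):      # output channel C_out_j
            for t in range(l_out):  # output position
                window = x[i, :, t:t + k]  # (C_in, kernel_size) slice of the input
                # bias + sum over input channels of weight * input (no kernel flip)
                out[i, j, t] = bias[j] + (weight[j] * window).sum()
    return out


with torch.no_grad():
    expected = conv(x)
    manual = manual_conv1d(x, conv.weight, conv.bias)

matches = torch.allclose(expected, manual, atol=1e-6)
print(matches)
```

The kernel is slid over the input without being flipped, which is why the documentation calls the operation cross-correlation rather than convolution in the strict mathematical sense.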

This module supports TensorFloat32.

  • stride controls the stride for the cross-correlation, a single number or a one-element tuple.

  • padding controls the amount of implicit zero-padding on both sides for padding number of points.

  • dilation controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe, but the official documentation links to a nice visualization of what dilation does.

  • groups controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example:
      • At groups=1, all inputs are convolved to all outputs.
      • At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels and producing half the output channels, and both subsequently concatenated.
      • At groups=in_channels, each input channel is convolved with its own set of filters, of size $\left\lfloor \frac{out\_channels}{in\_channels} \right\rfloor$.
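The effect of groups is easiest to see in the shape of the weight tensor. The sketch below uses illustrative sizes (in_channels=4, out_channels=8, kernel_size=3), which are assumptions and not from the original text, and records the weight shape for several values of groups.

```python
import torch.nn as nn

weight_shapes = {}
for groups in (1, 2, 4):
    conv = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=3, groups=groups)
    # weight has shape (out_channels, in_channels // groups, kernel_size)
    weight_shapes[groups] = tuple(conv.weight.shape)
    print(groups, weight_shapes[groups])
# prints:
# 1 (8, 4, 3)
# 2 (8, 2, 3)
# 4 (8, 1, 3)
```

With groups=4 (equal to in_channels here), each filter sees a single input channel, and each input channel gets 8 / 4 = 2 filters of its own.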

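1.1 Parameter explanation

  • Input: $(N, C_{in}, L_{in})$
  • Output: $(N, C_{out}, L_{out})$

As described above:

  • N is the batch size.

  • C is the number of channels. For a sequence, it is the dimension of each column vector.
    $C_{in}$: the encoding dimension of each column vector in the input sequence.
    $C_{out}$: the desired encoding dimension of each column vector in the output sequence.

  • L is the sequence length, that is, the number of column vectors in the sequence.
    $L_{in}$: the number of column vectors in the input sequence.
    $L_{out}$: the number of column vectors in the output sequence.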
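The "column vector" reading of the shape can be checked directly on a tensor. The sketch below assumes a hypothetical batch stored with the length axis before the feature axis, (N, L, C), which is an assumption for illustration and not something the text above requires. It is rearranged into the (N, C_in, L_in) layout that nn.Conv1d expects.

```python
import torch
import torch.nn as nn

# Hypothetical batch stored as (N, L, C): 20 sequences, 50 positions, 16 features each.
seq_first = torch.rand(20, 50, 16)

# Conv1d expects (N, C_in, L_in), so swap the length and feature axes.
channels_first = seq_first.permute(0, 2, 1)

# The column vector at position 3 of sample 0 has C_in = 16 entries.
column = channels_first[0, :, 3]

conv = nn.Conv1d(16, 33, kernel_size=3)  # C_in=16, C_out=33
out = conv(channels_first)

print(channels_first.shape)  # torch.Size([20, 16, 50])
print(column.shape)          # torch.Size([16])
print(out.shape)             # torch.Size([20, 33, 48])
```

After the layer, each of the 48 output positions holds a column vector of dimension C_out = 33.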

The output length $L_{out}$ is computed as:

$$
L_{out} = \left\lfloor \frac{L_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1 \right\rfloor
$$
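The formula can be wrapped in a small pure-Python helper for quick checks. The function name conv1d_out_length is hypothetical. The defaults mirror those of nn.Conv1d (stride 1, padding 0, dilation 1), and only integer padding is handled.

```python
def conv1d_out_length(l_in, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper: output length of a 1D convolution for integer padding."""
    numerator = l_in + 2 * padding - dilation * (kernel_size - 1) - 1
    # Adding the integer 1 after the floor gives the same result as flooring the whole expression.
    return numerator // stride + 1


print(conv1d_out_length(50, 3, stride=2))    # 24
print(conv1d_out_length(50, 3))              # 48
print(conv1d_out_length(50, 3, padding=1))   # 50
print(conv1d_out_length(50, 3, dilation=2))  # 46
```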

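1.2 Example

Here padding defaults to 0, dilation defaults to 1, and groups defaults to 1.

The output length follows from the formula above: with $L_{in} = 50$, kernel_size = 3 and stride = 2, we get $L_{out} = \lfloor (50 + 0 - 1 \times (3 - 1) - 1) / 2 + 1 \rfloor = \lfloor 24.5 \rfloor = 24$.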

import torch
import torch.nn as nn

m = nn.Conv1d(16, 33, 3, stride=2)  # in_channels=16, out_channels=33, kernel_size=3
input = torch.rand(20, 16, 50)      # (N, C_in, L_in)
output = m(input)
print(output.shape)
# torch.Size([20, 33, 24])