torch.nn
These are the basic building blocks for graphs:
Containers
torch.nn.Module | Base class for all neural network modules. |
torch.nn.Sequential | A sequential container. |
torch.nn.ModuleList | Holds submodules in a list. |
torch.nn.ModuleDict | Holds submodules in a dictionary. |
torch.nn.ParameterList | Holds parameters in a list. |
torch.nn.ParameterDict | Holds parameters in a dictionary. |
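A quick sketch of how the containers compose (layer sizes here are arbitrary): nn.Sequential chains modules in order, while nn.ModuleList registers a variable number of submodules so their parameters are tracked.

```python
import torch
import torch.nn as nn

# nn.Sequential chains modules in the order given.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

# nn.ModuleList registers submodules so their parameters appear
# in .parameters() and move with .to(device).
class Tower(nn.Module):
    def __init__(self, depth=3):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(depth)])

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

print(model(torch.randn(2, 16)).shape)  # torch.Size([2, 4])
```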
Global Hooks For Module
torch.nn.register_module_forward_pre_hook | Register a forward pre-hook common to all modules. |
torch.nn.register_module_forward_hook | Register a global forward hook for all the modules. |
torch.nn.register_module_backward_hook | Register a backward hook common to all the modules (deprecated in favor of register_module_full_backward_hook). |
torch.nn.register_module_full_backward_pre_hook | Register a backward pre-hook common to all the modules. |
torch.nn.register_module_full_backward_hook | Register a backward hook common to all the modules. |
torch.nn.register_module_buffer_registration_hook | Register a buffer registration hook common to all modules. |
torch.nn.register_module_module_registration_hook | Register a module registration hook common to all modules. |
torch.nn.register_module_parameter_registration_hook | Register a parameter registration hook common to all modules. |
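These global hooks fire for every module instance, not just one. A minimal sketch that logs output shapes during forward passes (importing from torch.nn.modules.module, where these functions are defined):

```python
import torch
import torch.nn as nn
from torch.nn.modules.module import register_module_forward_hook

# Print each module's class name and output shape on every forward pass.
def log_shapes(module, inputs, output):
    if isinstance(output, torch.Tensor):
        print(type(module).__name__, tuple(output.shape))

handle = register_module_forward_hook(log_shapes)
nn.Linear(4, 2)(torch.randn(1, 4))  # prints: Linear (1, 2)
handle.remove()                     # deregister the global hook when done
```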
Convolution Layers
torch.nn.Conv1d | Applies a 1D convolution over an input signal composed of several input planes. |
torch.nn.Conv2d | Applies a 2D convolution over an input signal composed of several input planes. |
torch.nn.Conv3d | Applies a 3D convolution over an input signal composed of several input planes. |
torch.nn.ConvTranspose1d | Applies a 1D transposed convolution operator over an input image composed of several input planes. |
torch.nn.ConvTranspose2d | Applies a 2D transposed convolution operator over an input image composed of several input planes. |
torch.nn.ConvTranspose3d | Applies a 3D transposed convolution operator over an input image composed of several input planes. |
torch.nn.LazyConv1d | A torch.nn.Conv1d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConv2d | A torch.nn.Conv2d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConv3d | A torch.nn.Conv3d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConvTranspose1d | A torch.nn.ConvTranspose1d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConvTranspose2d | A torch.nn.ConvTranspose2d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConvTranspose3d | A torch.nn.ConvTranspose3d module with lazy initialization of the in_channels argument. |
torch.nn.Unfold | Extracts sliding local blocks from a batched input tensor. |
torch.nn.Fold | Combines an array of sliding local blocks into a large containing tensor. |
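For orientation, a sketch with arbitrary channel counts: Conv2d expects (N, C, H, W) input, and the lazy variants defer in_channels until the first forward pass.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)  # (N, C, H, W)
print(conv(x).shape)           # torch.Size([8, 16, 32, 32]); padding=1 keeps H, W

# LazyConv2d infers in_channels from the first input it sees.
lazy = nn.LazyConv2d(out_channels=16, kernel_size=3)
_ = lazy(x)                    # in_channels is now materialized as 3
```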
Pooling layers
torch.nn.MaxPool1d | Applies a 1D max pooling over an input signal composed of several input planes. |
torch.nn.MaxPool2d | Applies a 2D max pooling over an input signal composed of several input planes. |
torch.nn.MaxPool3d | Applies a 3D max pooling over an input signal composed of several input planes. |
torch.nn.MaxUnpool1d | Computes a partial inverse of MaxPool1d. |
torch.nn.MaxUnpool2d | Computes a partial inverse of MaxPool2d. |
torch.nn.MaxUnpool3d | Computes a partial inverse of MaxPool3d. |
torch.nn.AvgPool1d | Applies a 1D average pooling over an input signal composed of several input planes. |
torch.nn.AvgPool2d | Applies a 2D average pooling over an input signal composed of several input planes. |
torch.nn.AvgPool3d | Applies a 3D average pooling over an input signal composed of several input planes. |
torch.nn.FractionalMaxPool2d | Applies a 2D fractional max pooling over an input signal composed of several input planes. |
torch.nn.FractionalMaxPool3d | Applies a 3D fractional max pooling over an input signal composed of several input planes. |
torch.nn.LPPool1d | Applies a 1D power-average pooling over an input signal composed of several input planes. |
torch.nn.LPPool2d | Applies a 2D power-average pooling over an input signal composed of several input planes. |
torch.nn.LPPool3d | Applies a 3D power-average pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveMaxPool1d | Applies a 1D adaptive max pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveMaxPool2d | Applies a 2D adaptive max pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveMaxPool3d | Applies a 3D adaptive max pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveAvgPool1d | Applies a 1D adaptive average pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveAvgPool2d | Applies a 2D adaptive average pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveAvgPool3d | Applies a 3D adaptive average pooling over an input signal composed of several input planes. |
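A short illustration of the two pooling styles, with shapes chosen arbitrarily: fixed-kernel pooling divides the spatial size, while adaptive pooling targets an output size directly.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)
print(nn.MaxPool2d(kernel_size=2)(x).shape)          # torch.Size([1, 3, 16, 16])

# Adaptive pooling fixes the output size regardless of the input size,
# which is convenient right before a classifier head.
print(nn.AdaptiveAvgPool2d(output_size=1)(x).shape)  # torch.Size([1, 3, 1, 1])
```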
Padding Layers
torch.nn.ReflectionPad1d | Pads the input tensor using the reflection of the input boundary. |
torch.nn.ReflectionPad2d | Pads the input tensor using the reflection of the input boundary. |
torch.nn.ReflectionPad3d | Pads the input tensor using the reflection of the input boundary. |
torch.nn.ReplicationPad1d | Pads the input tensor using replication of the input boundary. |
torch.nn.ReplicationPad2d | Pads the input tensor using replication of the input boundary. |
torch.nn.ReplicationPad3d | Pads the input tensor using replication of the input boundary. |
torch.nn.ZeroPad1d | Pads the input tensor boundaries with zero. |
torch.nn.ZeroPad2d | Pads the input tensor boundaries with zero. |
torch.nn.ZeroPad3d | Pads the input tensor boundaries with zero. |
torch.nn.ConstantPad1d | Pads the input tensor boundaries with a constant value. |
torch.nn.ConstantPad2d | Pads the input tensor boundaries with a constant value. |
torch.nn.ConstantPad3d | Pads the input tensor boundaries with a constant value. |
torch.nn.CircularPad1d | Pads the input tensor using circular padding of the input boundary. |
torch.nn.CircularPad2d | Pads the input tensor using circular padding of the input boundary. |
torch.nn.CircularPad3d | Pads the input tensor using circular padding of the input boundary. |
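The padding layers differ only in how they fill the new border; for example (tensor values arbitrary):

```python
import torch
import torch.nn as nn

x = torch.arange(9, dtype=torch.float32).reshape(1, 1, 3, 3)
# Both pad 1 element on each side of H and W: (1, 1, 3, 3) -> (1, 1, 5, 5).
print(nn.ZeroPad2d(1)(x))        # border filled with zeros
print(nn.ReflectionPad2d(1)(x))  # border mirrors interior values
```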
Non-linear Activations (weighted sum, nonlinearity)
torch.nn.ELU | Applies the Exponential Linear Unit (ELU) function, element-wise. |
torch.nn.Hardshrink | Applies the Hard Shrinkage (Hardshrink) function element-wise. |
torch.nn.Hardsigmoid | Applies the Hardsigmoid function element-wise. |
torch.nn.Hardtanh | Applies the HardTanh function element-wise. |
torch.nn.Hardswish | Applies the Hardswish function, element-wise. |
torch.nn.LeakyReLU | Applies the LeakyReLU function element-wise. |
torch.nn.LogSigmoid | Applies the Logsigmoid function element-wise. |
torch.nn.MultiheadAttention | Allows the model to jointly attend to information from different representation subspaces. |
torch.nn.PReLU | Applies the element-wise PReLU function. |
torch.nn.ReLU | Applies the rectified linear unit function element-wise. |
torch.nn.ReLU6 | Applies the ReLU6 function element-wise. |
torch.nn.RReLU | Applies the randomized leaky rectified linear unit function, element-wise. |
torch.nn.SELU | Applies the SELU function element-wise. |
torch.nn.CELU | Applies the CELU function element-wise. |
torch.nn.GELU | Applies the Gaussian Error Linear Units function. |
torch.nn.Sigmoid | Applies the Sigmoid function element-wise. |
torch.nn.SiLU | Applies the Sigmoid Linear Unit (SiLU) function, element-wise. |
torch.nn.Mish | Applies the Mish function, element-wise. |
torch.nn.Softplus | Applies the Softplus function element-wise. |
torch.nn.Softshrink | Applies the soft shrinkage function element-wise. |
torch.nn.Softsign | Applies the element-wise Softsign function. |
torch.nn.Tanh | Applies the Hyperbolic Tangent (Tanh) function element-wise. |
torch.nn.Tanhshrink | Applies the element-wise Tanhshrink function. |
torch.nn.Threshold | Thresholds each element of the input Tensor. |
torch.nn.GLU | Applies the gated linear unit function. |
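Most of these apply element-wise and preserve the input shape; a small comparison:

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
print(nn.ReLU()(x))   # tensor([0.0000, 0.0000, 0.0000, 1.5000])
print(nn.GELU()(x))   # smooth relaxation of ReLU
print(nn.Tanh()(x))   # squashes values into (-1, 1)
```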
Non-linear Activations (other)
torch.nn.Softmin | Applies the Softmin function to an n-dimensional input Tensor. |
torch.nn.Softmax | Applies the Softmax function to an n-dimensional input Tensor. |
torch.nn.Softmax2d | Applies SoftMax over features to each spatial location. |
torch.nn.LogSoftmax | Applies the \(\log(\text{Softmax}(x))\) function to an n-dimensional input Tensor. |
torch.nn.AdaptiveLogSoftmaxWithLoss | Efficient softmax approximation. |
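For example, applied along the class dimension:

```python
import torch
import torch.nn as nn

logits = torch.tensor([[1.0, 2.0, 3.0]])
probs = nn.Softmax(dim=-1)(logits)         # each row sums to 1
log_probs = nn.LogSoftmax(dim=-1)(logits)  # numerically stabler than probs.log()
print(probs.sum(dim=-1))                   # tensor([1.])
```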
Normalization Layers
torch.nn.BatchNorm1d | Applies Batch Normalization over a 2D or 3D input. |
torch.nn.BatchNorm2d | Applies Batch Normalization over a 4D input. |
torch.nn.BatchNorm3d | Applies Batch Normalization over a 5D input. |
torch.nn.LazyBatchNorm1d | A torch.nn.BatchNorm1d module with lazy initialization. |
torch.nn.LazyBatchNorm2d | A torch.nn.BatchNorm2d module with lazy initialization. |
torch.nn.LazyBatchNorm3d | A torch.nn.BatchNorm3d module with lazy initialization. |
torch.nn.GroupNorm | Applies Group Normalization over a mini-batch of inputs. |
torch.nn.SyncBatchNorm | Applies Batch Normalization over an N-dimensional input. |
torch.nn.InstanceNorm1d | Applies Instance Normalization. |
torch.nn.InstanceNorm2d | Applies Instance Normalization. |
torch.nn.InstanceNorm3d | Applies Instance Normalization. |
torch.nn.LazyInstanceNorm1d | A torch.nn.InstanceNorm1d module with lazy initialization of the num_features argument. |
torch.nn.LazyInstanceNorm2d | A torch.nn.InstanceNorm2d module with lazy initialization of the num_features argument. |
torch.nn.LazyInstanceNorm3d | A torch.nn.InstanceNorm3d module with lazy initialization of the num_features argument. |
torch.nn.LayerNorm | Applies Layer Normalization over a mini-batch of inputs. |
torch.nn.LocalResponseNorm | Applies local response normalization over an input signal. |
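A sketch contrasting the two most common choices (feature sizes arbitrary): batch norm normalizes each channel across the batch, while layer norm normalizes each sample across its trailing dimensions.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 8, 10)               # (N, C, L)
bn = nn.BatchNorm1d(num_features=8)     # per-channel stats over N and L
ln = nn.LayerNorm(normalized_shape=10)  # per-sample stats over the last dim
print(bn(x).shape, ln(x).shape)         # both torch.Size([4, 8, 10])
```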
Recurrent Layers
torch.nn.RNNBase | Base class for RNN modules (RNN, LSTM, GRU). |
torch.nn.RNN | Apply a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence. |
torch.nn.LSTM | Apply a multi-layer long short-term memory (LSTM) RNN to an input sequence. |
torch.nn.GRU | Apply a multi-layer gated recurrent unit (GRU) RNN to an input sequence. |
torch.nn.RNNCell | An Elman RNN cell with tanh or ReLU non-linearity. |
torch.nn.LSTMCell | A long short-term memory (LSTM) cell. |
torch.nn.GRUCell | A gated recurrent unit (GRU) cell. |
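A minimal LSTM example (sizes arbitrary); with batch_first=True the input is (batch, seq, features):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=16, hidden_size=32, num_layers=2, batch_first=True)
x = torch.randn(4, 10, 16)  # (batch, seq, features)
output, (h_n, c_n) = lstm(x)
print(output.shape)         # torch.Size([4, 10, 32]): hidden states for all steps
print(h_n.shape)            # torch.Size([2, 4, 32]): final state per layer
```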
Transformer Layers
torch.nn.Transformer | A transformer model. |
torch.nn.TransformerEncoder | TransformerEncoder is a stack of N encoder layers. |
torch.nn.TransformerDecoder | TransformerDecoder is a stack of N decoder layers. |
torch.nn.TransformerEncoderLayer | TransformerEncoderLayer is made up of self-attn and feedforward network. |
torch.nn.TransformerDecoderLayer | TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network. |
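A sketch of stacking encoder layers (hyperparameters arbitrary):

```python
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=3)
x = torch.randn(2, 20, 64)  # (batch, seq, d_model) with batch_first=True
print(encoder(x).shape)     # torch.Size([2, 20, 64]): shape is preserved
```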
Linear Layers
torch.nn.Identity | A placeholder identity operator that is argument-insensitive. |
torch.nn.Linear | Applies a linear transformation to the incoming data: \(y = xA^{T} + b\). |
torch.nn.Bilinear | Applies a bilinear transformation to the incoming data: \(y = x_{1}^{T} A x_{2} + b\). |
torch.nn.LazyLinear | A torch.nn.Linear module where in_features is inferred. |
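For example:

```python
import torch
import torch.nn as nn

linear = nn.Linear(in_features=20, out_features=5)  # y = x @ A.T + b
x = torch.randn(3, 20)
print(linear(x).shape)                # torch.Size([3, 5])

lazy = nn.LazyLinear(out_features=5)  # in_features deferred until first call
_ = lazy(x)                           # now materialized with in_features=20
```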
Dropout Layers
torch.nn.Dropout | During training, randomly zeroes some of the elements of the input tensor with probability p. |
torch.nn.Dropout1d | Randomly zero out entire channels. |
torch.nn.Dropout2d | Randomly zero out entire channels. |
torch.nn.Dropout3d | Randomly zero out entire channels. |
torch.nn.AlphaDropout | Applies Alpha Dropout over the input. |
torch.nn.FeatureAlphaDropout | Randomly masks out entire channels. |
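Dropout is only active in training mode, which is why model.train() / model.eval() matters; a quick demonstration:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)
drop.train()
print(drop(x))  # about half the entries zeroed, survivors scaled by 1/(1-p)
drop.eval()
print(drop(x))  # identity at evaluation time
```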
Sparse Layers
torch.nn.Embedding | A simple lookup table that stores embeddings of a fixed dictionary and size. |
torch.nn.EmbeddingBag | Compute sums or means of 'bags' of embeddings, without instantiating the intermediate embeddings. |
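For example, mapping integer token ids to dense vectors (vocabulary size and dimension arbitrary):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=1000, embedding_dim=64)
tokens = torch.tensor([[1, 5, 9], [2, 0, 7]])  # (batch, seq) of indices
print(emb(tokens).shape)                       # torch.Size([2, 3, 64])
```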
Distance Functions
torch.nn.CosineSimilarity | Returns cosine similarity between \(x_1\) and \(x_2\), computed along dim. |
torch.nn.PairwiseDistance | Computes the pairwise distance between input vectors, or between columns of input matrices. |
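Both reduce a pair of batched vectors to one score per row:

```python
import torch
import torch.nn as nn

a, b = torch.randn(5, 128), torch.randn(5, 128)
print(nn.CosineSimilarity(dim=1)(a, b).shape)  # torch.Size([5]), values in [-1, 1]
print(nn.PairwiseDistance(p=2)(a, b).shape)    # torch.Size([5]), L2 distances
```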
Loss Functions
torch.nn.L1Loss | Creates a criterion that measures the mean absolute error (MAE) between each element in the input \(x\) and target \(y\). |
torch.nn.MSELoss | Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input \(x\) and target \(y\). |
torch.nn.CrossEntropyLoss | This criterion computes the cross entropy loss between input logits and target. |
torch.nn.CTCLoss | The Connectionist Temporal Classification loss. |
torch.nn.NLLLoss | The negative log likelihood loss. |
torch.nn.PoissonNLLLoss | Negative log likelihood loss with Poisson distribution of target. |
torch.nn.GaussianNLLLoss | Gaussian negative log likelihood loss. |
torch.nn.KLDivLoss | The Kullback-Leibler divergence loss. |
torch.nn.BCELoss | Creates a criterion that measures the Binary Cross Entropy between the target and the input probabilities: |
torch.nn.BCEWithLogitsLoss | This loss combines a Sigmoid layer and the BCELoss in one single class. |
torch.nn.MarginRankingLoss | Creates a criterion that measures the loss given inputs \(x_1\), \(x_2\), two 1D mini-batch or 0D Tensors, and a label 1D mini-batch or 0D Tensor \(y\) (containing 1 or -1). |
torch.nn.HingeEmbeddingLoss | Measures the loss given an input tensor \(x\) and a labels tensor \(y\) (containing 1 or -1). |
torch.nn.MultiLabelMarginLoss | Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input \(x\) (a 2D mini-batch Tensor) and output \(y\) (which is a 2D Tensor of target class indices). |
torch.nn.HuberLoss | Creates a criterion that uses a squared term if the absolute element-wise error falls below delta and a delta-scaled L1 term otherwise. |
torch.nn.SmoothL1Loss | Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise. |
torch.nn.SoftMarginLoss | Creates a criterion that optimizes a two-class classification logistic loss between input tensor \(x\) and target tensor \(y\) (containing 1 or -1). |
torch.nn.MultiLabelSoftMarginLoss | Creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy, between input \(x\) and target \(y\) of size \((N, C)\). |
torch.nn.CosineEmbeddingLoss | Creates a criterion that measures the loss given input tensors \(x_1\), \(x_2\) and a Tensor label \(y\) with values 1 or -1. |
torch.nn.MultiMarginLoss | Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input \(x\) (a 2D mini-batch Tensor) and output \(y\) (which is a 1D tensor of target class indices, \(0 \leq y \leq x.\mathrm{size}(1)-1\)). |
torch.nn.TripletMarginLoss | Creates a criterion that measures the triplet loss given input tensors \(x_1\), \(x_2\), \(x_3\) and a margin with a value greater than 0. |
torch.nn.TripletMarginWithDistanceLoss | Creates a criterion that measures the triplet loss given input tensors aa, pp, and nn (representing anchor, positive, and negative examples, respectively), and a nonnegative, real-valued function ("distance function") used to compute the relationship between the anchor and positive example ("positive distance") and the anchor and negative example ("negative distance"). |
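As a usage sketch, note the shape conventions: classification losses like CrossEntropyLoss take raw logits plus integer class targets, while regression losses compare tensors of identical shape.

```python
import torch
import torch.nn as nn

# Classification: logits of shape (batch, num_classes), integer targets.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 1, 0, 9])
print(nn.CrossEntropyLoss()(logits, targets))  # scalar loss

# Regression: prediction and target share a shape.
pred, target = torch.randn(4, 1), torch.randn(4, 1)
print(nn.MSELoss()(pred, target), nn.L1Loss()(pred, target))
```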
Vision Layers
torch.nn.PixelShuffle | Rearrange elements in a tensor according to an upscaling factor. |
torch.nn.PixelUnshuffle | Reverse the PixelShuffle operation. |
torch.nn.Upsample | Upsamples a given multi-channel 1D (temporal), 2D (spatial) or 3D (volumetric) data. |
torch.nn.UpsamplingNearest2d | Applies a 2D nearest neighbor upsampling to an input signal composed of several input channels. |
torch.nn.UpsamplingBilinear2d | Applies a 2D bilinear upsampling to an input signal composed of several input channels. |
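For example (channel counts arbitrary): PixelShuffle trades channels for spatial resolution, while Upsample interpolates.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 8, 8)
print(nn.PixelShuffle(upscale_factor=2)(x).shape)  # torch.Size([1, 4, 16, 16])

up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
print(up(torch.randn(1, 3, 8, 8)).shape)           # torch.Size([1, 3, 16, 16])
```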
Shuffle Layers
torch.nn.ChannelShuffle | Divides and rearranges the channels in a tensor. |
DataParallel Layers (multi-GPU, distributed)
torch.nn.DataParallel | Implements data parallelism at the module level. |
torch.nn.parallel.DistributedDataParallel | Implements distributed data parallelism based on torch.distributed at the module level. |
Utilities
From the torch.nn.utils module:
Utility functions to clip parameter gradients.
torch.nn.utils.clip_grad_norm_ | Clip the gradient norm of an iterable of parameters. |
torch.nn.utils.clip_grad_norm | Clip the gradient norm of an iterable of parameters (deprecated in favor of torch.nn.utils.clip_grad_norm_). |
torch.nn.utils.clip_grad_value_ | Clip the gradients of an iterable of parameters at specified value. |
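The in-place variants are the ones to use in a training loop, after backward() and before the optimizer step; a minimal sketch:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
# Rescale all gradients in place so their combined norm is at most 1.0.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```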
Utility functions to flatten and unflatten Module parameters to and from a single vector.
torch.nn.utils.parameters_to_vector | Flatten an iterable of parameters into a single vector. |
torch.nn.utils.vector_to_parameters | Copy slices of a vector into an iterable of parameters. |
Utility functions to fuse Modules with BatchNorm modules.
torch.nn.utils.fuse_conv_bn_eval | Fuse a convolutional module and a BatchNorm module into a single, new convolutional module. |
torch.nn.utils.fuse_conv_bn_weights | Fuse convolutional module parameters and BatchNorm module parameters into new convolutional module parameters. |
torch.nn.utils.fuse_linear_bn_eval | Fuse a linear module and a BatchNorm module into a single, new linear module. |
torch.nn.utils.fuse_linear_bn_weights | Fuse linear module parameters and BatchNorm module parameters into new linear module parameters. |
Utility functions to convert Module parameter memory formats.
torch.nn.utils.convert_conv2d_weight_memory_format | Convert the memory_format of nn.Conv2d.weight to the given memory_format. |
torch.nn.utils.convert_conv3d_weight_memory_format | Convert the memory_format of nn.Conv3d.weight to the given memory_format; the conversion applies recursively to nested nn.Module instances, including the module itself. |
Utility functions to apply and remove weight normalization from Module parameters.
torch.nn.utils.weight_norm | Apply weight normalization to a parameter in the given module. |
torch.nn.utils.remove_weight_norm | Remove the weight normalization reparameterization from a module. |
torch.nn.utils.spectral_norm | Apply spectral normalization to a parameter in the given module. |
torch.nn.utils.remove_spectral_norm | Remove the spectral normalization reparameterization from a module. |
Utility functions for initializing Module parameters.
torch.nn.utils.skip_init | Given a module class object and args / kwargs, instantiate the module without initializing parameters / buffers. |
Utility classes and functions for pruning Module parameters.
torch.nn.utils.prune.BasePruningMethod | Abstract base class for creation of new pruning techniques. |
torch.nn.utils.prune.PruningContainer | Container holding a sequence of pruning methods for iterative pruning. |
torch.nn.utils.prune.Identity | Utility pruning method that does not prune any units but generates the pruning parametrization with a mask of ones. |
torch.nn.utils.prune.RandomUnstructured | Prune (currently unpruned) units in a tensor at random. |
torch.nn.utils.prune.L1Unstructured | Prune (currently unpruned) units in a tensor by zeroing out the ones with the lowest L1-norm. |
torch.nn.utils.prune.RandomStructured | Prune entire (currently unpruned) channels in a tensor at random. |
torch.nn.utils.prune.LnStructured | Prune entire (currently unpruned) channels in a tensor based on their Ln-norm. |
torch.nn.utils.prune.CustomFromMask | Utility pruning method that prunes a tensor by applying a user-provided, pre-computed mask. |
torch.nn.utils.prune.identity | Apply pruning reparametrization without pruning any units. |
torch.nn.utils.prune.random_unstructured | Prune tensor by removing random (currently unpruned) units. |
torch.nn.utils.prune.l1_unstructured | Prune tensor by removing units with the lowest L1-norm. |
torch.nn.utils.prune.random_structured | Prune tensor by removing random channels along the specified dimension. |
torch.nn.utils.prune.ln_structured | Prune tensor by removing channels with the lowest Ln-norm along the specified dimension. |
torch.nn.utils.prune.global_unstructured | Globally prunes tensors corresponding to all parameters in parameters by applying the specified pruning_method. |
torch.nn.utils.prune.custom_from_mask | Prune tensor corresponding to parameter called name in module by applying the pre-computed mask in mask. |
torch.nn.utils.prune.remove | Remove the pruning reparameterization from a module and the pruning method from the forward hook. |
torch.nn.utils.prune.is_pruned | Check if a module is pruned by looking for pruning pre-hooks. |
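A short end-to-end sketch: apply a mask, inspect it, then make the pruning permanent.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(8, 4)
# Zero out the 50% of weights with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.5)
print(prune.is_pruned(layer))                     # True
print((layer.weight == 0).float().mean().item())  # 0.5

# Fold the mask into the parameter and remove the reparametrization.
prune.remove(layer, "weight")
```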
Parametrizations implemented using the new parametrization functionality in torch.nn.utils.parametrize.register_parametrization().
torch.nn.utils.parametrizations.orthogonal | Apply an orthogonal or unitary parametrization to a matrix or a batch of matrices. |
torch.nn.utils.parametrizations.weight_norm | Apply weight normalization to a parameter in the given module. |
torch.nn.utils.parametrizations.spectral_norm | Apply spectral normalization to a parameter in the given module. |
Utility functions to parametrize Tensors on existing Modules. Note that these functions can be used to parametrize a given Parameter or Buffer given a specific function that maps from an input space to the parametrized space. They are not parametrizations that would transform an object into a parameter. See the Parametrizations tutorial for more information on how to implement your own parametrizations.
torch.nn.utils.parametrize.register_parametrization | Register a parametrization to a tensor in a module. |
torch.nn.utils.parametrize.remove_parametrizations | Remove the parametrizations on a tensor in a module. |
torch.nn.utils.parametrize.cached | Context manager that enables the caching system within parametrizations registered with register_parametrization(). |
torch.nn.utils.parametrize.is_parametrized | Determine if a module has a parametrization. |
torch.nn.utils.parametrize.ParametrizationList | A sequential container that holds and manages the original parameters or buffers of a parametrized torch.nn.Module. |
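For instance, constraining a square weight to stay symmetric (the Symmetric module here is illustrative, along the lines of the Parametrizations tutorial):

```python
import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

# Maps the underlying free parameter to a symmetric matrix.
class Symmetric(nn.Module):
    def forward(self, X):
        return X.triu() + X.triu(1).transpose(-1, -2)

layer = nn.Linear(4, 4)
parametrize.register_parametrization(layer, "weight", Symmetric())
print(torch.allclose(layer.weight, layer.weight.T))  # True, by construction
```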
Utility functions to call a given Module in a stateless manner.
torch.nn.utils.stateless.functional_call | Perform a functional call on the module by replacing the module parameters and buffers with the provided ones. |
Utility functions in other modules
torch.nn.utils.rnn.PackedSequence | Holds the data and list of batch_sizes of a packed sequence. |
torch.nn.utils.rnn.pack_padded_sequence | Packs a Tensor containing padded sequences of variable length. |
torch.nn.utils.rnn.pad_packed_sequence | Pad a packed batch of variable length sequences. |
torch.nn.utils.rnn.pad_sequence | Pad a list of variable length Tensors with padding_value. |
torch.nn.utils.rnn.pack_sequence | Packs a list of variable length Tensors. |
torch.nn.utils.rnn.unpack_sequence | Unpack PackedSequence into a list of variable length Tensors. |
torch.nn.utils.rnn.unpad_sequence | Unpad padded Tensor into a list of variable length Tensors. |
torch.nn.Flatten | Flattens a contiguous range of dims into a tensor. |
torch.nn.Unflatten | Unflattens a tensor dim expanding it to a desired shape. |
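A sketch of the round trip for variable-length sequences (sizes arbitrary; by default pack_padded_sequence expects lengths sorted in decreasing order):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

seqs = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(2, 8)]  # varying lengths
padded = pad_sequence(seqs, batch_first=True)       # torch.Size([3, 5, 8])
packed = pack_padded_sequence(padded, lengths=torch.tensor([5, 3, 2]),
                              batch_first=True)

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
out_packed, _ = lstm(packed)                        # padded steps are skipped
out, out_lens = pad_packed_sequence(out_packed, batch_first=True)
print(out.shape)                                    # torch.Size([3, 5, 16])
```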
Quantized Functions
Quantization refers to techniques for performing computations and storing tensors at lower bitwidths than floating point precision. PyTorch supports both per tensor and per channel asymmetric linear quantization. To learn more how to use quantized functions in PyTorch, please refer to the Quantization documentation.
Lazy Modules Initialization
torch.nn.modules.lazy.LazyModuleMixin | A mixin for modules that lazily initialize parameters, also known as "lazy modules". |
torch.nn.parameter
torch.nn.parameter.Parameter | A kind of Tensor that is to be considered a module parameter. |
torch.nn.parameter.UninitializedParameter | A parameter that is not initialized. |
torch.nn.parameter.UninitializedBuffer | A buffer that is not initialized. |
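Assigning an nn.Parameter as a module attribute registers it automatically; a minimal example:

```python
import torch
import torch.nn as nn

class Scale(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensor with the module, so it is
        # returned by .parameters() and updated by optimizers.
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.scale * x

print(list(Scale().parameters()))  # one Parameter containing tensor([1.])
```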