torch.nn
These are the basic building blocks for graphs:
Containers
torch.nn.Module | Base class for all neural network modules. |
torch.nn.Sequential | A sequential container. |
torch.nn.ModuleList | Holds submodules in a list. |
torch.nn.ModuleDict | Holds submodules in a dictionary. |
torch.nn.ParameterList | Holds parameters in a list. |
torch.nn.ParameterDict | Holds parameters in a dictionary. |
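A quick sketch of how the containers compose (layer sizes here are arbitrary): nn.Sequential chains modules in order, while nn.ModuleList registers a variable number of submodules so their parameters are tracked.

```python
import torch
import torch.nn as nn

# nn.Sequential chains modules in the order given.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

# nn.ModuleList registers submodules so their parameters appear
# in .parameters() and move with .to(device).
class Tower(nn.Module):
    def __init__(self, depth=3):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(depth)])

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

print(model(torch.randn(2, 16)).shape)  # torch.Size([2, 4])
```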
Global Hooks For Module
torch.nn.register_module_forward_pre_hook | Register a forward pre-hook common to all modules. |
torch.nn.register_module_forward_hook | Register a global forward hook for all the modules. |
torch.nn.register_module_backward_hook | Register a backward hook common to all the modules (deprecated in favor of register_module_full_backward_hook). |
torch.nn.register_module_full_backward_pre_hook | Register a backward pre-hook common to all the modules. |
torch.nn.register_module_full_backward_hook | Register a backward hook common to all the modules. |
torch.nn.register_module_buffer_registration_hook | Register a buffer registration hook common to all modules. |
torch.nn.register_module_module_registration_hook | Register a module registration hook common to all modules. |
torch.nn.register_module_parameter_registration_hook | Register a parameter registration hook common to all modules. |
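These global hooks fire for every module instance, not just one. A minimal sketch that logs output shapes during forward passes (importing from torch.nn.modules.module, where these functions are defined):

```python
import torch
import torch.nn as nn
from torch.nn.modules.module import register_module_forward_hook

# Print each module's class name and output shape on every forward pass.
def log_shapes(module, inputs, output):
    if isinstance(output, torch.Tensor):
        print(type(module).__name__, tuple(output.shape))

handle = register_module_forward_hook(log_shapes)
nn.Linear(4, 2)(torch.randn(1, 4))  # prints: Linear (1, 2)
handle.remove()                     # deregister the global hook when done
```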
Convolution Layers
torch.nn.Conv1d | Applies a 1D convolution over an input signal composed of several input planes. |
torch.nn.Conv2d | Applies a 2D convolution over an input signal composed of several input planes. |
torch.nn.Conv3d | Applies a 3D convolution over an input signal composed of several input planes. |
torch.nn.ConvTranspose1d | Applies a 1D transposed convolution operator over an input image composed of several input planes. |
torch.nn.ConvTranspose2d | Applies a 2D transposed convolution operator over an input image composed of several input planes. |
torch.nn.ConvTranspose3d | Applies a 3D transposed convolution operator over an input image composed of several input planes. |
torch.nn.LazyConv1d | A torch.nn.Conv1d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConv2d | A torch.nn.Conv2d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConv3d | A torch.nn.Conv3d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConvTranspose1d | A torch.nn.ConvTranspose1d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConvTranspose2d | A torch.nn.ConvTranspose2d module with lazy initialization of the in_channels argument. |
torch.nn.LazyConvTranspose3d | A torch.nn.ConvTranspose3d module with lazy initialization of the in_channels argument. |
torch.nn.Unfold | Extracts sliding local blocks from a batched input tensor. |
torch.nn.Fold | Combines an array of sliding local blocks into a large containing tensor. |
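For orientation, a sketch with arbitrary channel counts: Conv2d expects (N, C, H, W) input, and the lazy variants defer in_channels until the first forward pass.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)  # (N, C, H, W)
print(conv(x).shape)           # torch.Size([8, 16, 32, 32]); padding=1 keeps H, W

# LazyConv2d infers in_channels from the first input it sees.
lazy = nn.LazyConv2d(out_channels=16, kernel_size=3)
_ = lazy(x)                    # in_channels is now materialized as 3
```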
Pooling layers
torch.nn.MaxPool1d | Applies a 1D max pooling over an input signal composed of several input planes. |
torch.nn.MaxPool2d | Applies a 2D max pooling over an input signal composed of several input planes. |
torch.nn.MaxPool3d | Applies a 3D max pooling over an input signal composed of several input planes. |
torch.nn.MaxUnpool1d | Computes a partial inverse of MaxPool1d. |
torch.nn.MaxUnpool2d | Computes a partial inverse of MaxPool2d. |
torch.nn.MaxUnpool3d | Computes a partial inverse of MaxPool3d. |
torch.nn.AvgPool1d | Applies a 1D average pooling over an input signal composed of several input planes. |
torch.nn.AvgPool2d | Applies a 2D average pooling over an input signal composed of several input planes. |
torch.nn.AvgPool3d | Applies a 3D average pooling over an input signal composed of several input planes. |
torch.nn.FractionalMaxPool2d | Applies a 2D fractional max pooling over an input signal composed of several input planes. |
torch.nn.FractionalMaxPool3d | Applies a 3D fractional max pooling over an input signal composed of several input planes. |
torch.nn.LPPool1d | Applies a 1D power-average pooling over an input signal composed of several input planes. |
torch.nn.LPPool2d | Applies a 2D power-average pooling over an input signal composed of several input planes. |
torch.nn.LPPool3d | Applies a 3D power-average pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveMaxPool1d | Applies a 1D adaptive max pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveMaxPool2d | Applies a 2D adaptive max pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveMaxPool3d | Applies a 3D adaptive max pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveAvgPool1d | Applies a 1D adaptive average pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveAvgPool2d | Applies a 2D adaptive average pooling over an input signal composed of several input planes. |
torch.nn.AdaptiveAvgPool3d | Applies a 3D adaptive average pooling over an input signal composed of several input planes. |
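A short illustration of the two pooling styles, with shapes chosen arbitrarily: fixed-kernel pooling divides the spatial size, while adaptive pooling targets an output size directly.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)
print(nn.MaxPool2d(kernel_size=2)(x).shape)          # torch.Size([1, 3, 16, 16])

# Adaptive pooling fixes the output size regardless of the input size,
# which is convenient right before a classifier head.
print(nn.AdaptiveAvgPool2d(output_size=1)(x).shape)  # torch.Size([1, 3, 1, 1])
```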
Padding Layers
torch.nn.ReflectionPad1d | Pads the input tensor using the reflection of the input boundary. |
torch.nn.ReflectionPad2d | Pads the input tensor using the reflection of the input boundary. |
torch.nn.ReflectionPad3d | Pads the input tensor using the reflection of the input boundary. |
torch.nn.ReplicationPad1d | Pads the input tensor using replication of the input boundary. |
torch.nn.ReplicationPad2d | Pads the input tensor using replication of the input boundary. |
torch.nn.ReplicationPad3d | Pads the input tensor using replication of the input boundary. |
torch.nn.ZeroPad1d | Pads the input tensor boundaries with zero. |
torch.nn.ZeroPad2d | Pads the input tensor boundaries with zero. |
torch.nn.ZeroPad3d | Pads the input tensor boundaries with zero. |
torch.nn.ConstantPad1d | Pads the input tensor boundaries with a constant value. |
torch.nn.ConstantPad2d | Pads the input tensor boundaries with a constant value. |
torch.nn.ConstantPad3d | Pads the input tensor boundaries with a constant value. |
torch.nn.CircularPad1d | Pads the input tensor using circular padding of the input boundary. |
torch.nn.CircularPad2d | Pads the input tensor using circular padding of the input boundary. |
torch.nn.CircularPad3d | Pads the input tensor using circular padding of the input boundary. |
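The padding layers differ only in how they fill the new border; for example (tensor values arbitrary):

```python
import torch
import torch.nn as nn

x = torch.arange(9, dtype=torch.float32).reshape(1, 1, 3, 3)
# Both pad 1 element on each side of H and W: (1, 1, 3, 3) -> (1, 1, 5, 5).
print(nn.ZeroPad2d(1)(x))        # border filled with zeros
print(nn.ReflectionPad2d(1)(x))  # border mirrors interior values
```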
Non-linear Activations (weighted sum, nonlinearity)
torch.nn.ELU | Applies the Exponential Linear Unit (ELU) function, element-wise. |
torch.nn.Hardshrink | Applies the Hard Shrinkage (Hardshrink) function element-wise. |
torch.nn.Hardsigmoid | Applies the Hardsigmoid function element-wise. |
torch.nn.Hardtanh | Applies the HardTanh function element-wise. |
torch.nn.Hardswish | Applies the Hardswish function, element-wise. |
torch.nn.LeakyReLU | Applies the LeakyReLU function element-wise. |
torch.nn.LogSigmoid | Applies the Logsigmoid function element-wise. |
torch.nn.MultiheadAttention | Allows the model to jointly attend to information from different representation subspaces. |
torch.nn.PReLU | Applies the element-wise PReLU function. |
torch.nn.ReLU | Applies the rectified linear unit function element-wise. |
torch.nn.ReLU6 | Applies the ReLU6 function element-wise. |
torch.nn.RReLU | Applies the randomized leaky rectified linear unit function, element-wise. |
torch.nn.SELU | Applies the SELU function element-wise. |
torch.nn.CELU | Applies the CELU function element-wise. |
torch.nn.GELU | Applies the Gaussian Error Linear Units function. |
torch.nn.Sigmoid | Applies the Sigmoid function element-wise. |
torch.nn.SiLU | Applies the Sigmoid Linear Unit (SiLU) function, element-wise. |
torch.nn.Mish | Applies the Mish function, element-wise. |
torch.nn.Softplus | Applies the Softplus function element-wise. |
torch.nn.Softshrink | Applies the soft shrinkage function element-wise. |
torch.nn.Softsign | Applies the element-wise Softsign function. |
torch.nn.Tanh | Applies the Hyperbolic Tangent (Tanh) function element-wise. |
torch.nn.Tanhshrink | Applies the element-wise Tanhshrink function. |
torch.nn.Threshold | Thresholds each element of the input Tensor. |
torch.nn.GLU | Applies the gated linear unit function. |
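Most of these apply element-wise and preserve the input shape; a small comparison:

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
print(nn.ReLU()(x))   # tensor([0.0000, 0.0000, 0.0000, 1.5000])
print(nn.GELU()(x))   # smooth relaxation of ReLU
print(nn.Tanh()(x))   # squashes values into (-1, 1)
```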
Non-linear Activations (other)
torch.nn.Softmin | Applies the Softmin function to an n-dimensional input Tensor. |
torch.nn.Softmax | Applies the Softmax function to an n-dimensional input Tensor. |
torch.nn.Softmax2d | Applies SoftMax over features to each spatial location. |
torch.nn.LogSoftmax | Applies the \(\log(\text{Softmax}(x))\) function to an n-dimensional input Tensor. |
torch.nn.AdaptiveLogSoftmaxWithLoss | Efficient softmax approximation. |
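For example, applied along the class dimension:

```python
import torch
import torch.nn as nn

logits = torch.tensor([[1.0, 2.0, 3.0]])
probs = nn.Softmax(dim=-1)(logits)         # each row sums to 1
log_probs = nn.LogSoftmax(dim=-1)(logits)  # numerically stabler than probs.log()
print(probs.sum(dim=-1))                   # tensor([1.])
```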
Normalization Layers
torch.nn.BatchNorm1d | Applies Batch Normalization over a 2D or 3D input. |
torch.nn.BatchNorm2d | Applies Batch Normalization over a 4D input. |
torch.nn.BatchNorm3d | Applies Batch Normalization over a 5D input. |
torch.nn.LazyBatchNorm1d | A torch.nn.BatchNorm1d module with lazy initialization. |
torch.nn.LazyBatchNorm2d | A torch.nn.BatchNorm2d module with lazy initialization. |
torch.nn.LazyBatchNorm3d | A torch.nn.BatchNorm3d module with lazy initialization. |
torch.nn.GroupNorm | Applies Group Normalization over a mini-batch of inputs. |
torch.nn.SyncBatchNorm | Applies Batch Normalization over an N-dimensional input. |
torch.nn.InstanceNorm1d | Applies Instance Normalization. |
torch.nn.InstanceNorm2d | Applies Instance Normalization. |
torch.nn.InstanceNorm3d | Applies Instance Normalization. |
torch.nn.LazyInstanceNorm1d | A torch.nn.InstanceNorm1d module with lazy initialization of the num_features argument. |
torch.nn.LazyInstanceNorm2d | A torch.nn.InstanceNorm2d module with lazy initialization of the num_features argument. |
torch.nn.LazyInstanceNorm3d | A torch.nn.InstanceNorm3d module with lazy initialization of the num_features argument. |
torch.nn.LayerNorm | Applies Layer Normalization over a mini-batch of inputs. |
torch.nn.LocalResponseNorm | Applies local response normalization over an input signal. |
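A sketch contrasting the two most common choices (feature sizes arbitrary): batch norm normalizes each channel across the batch, while layer norm normalizes each sample across its trailing dimensions.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 8, 10)               # (N, C, L)
bn = nn.BatchNorm1d(num_features=8)     # per-channel stats over N and L
ln = nn.LayerNorm(normalized_shape=10)  # per-sample stats over the last dim
print(bn(x).shape, ln(x).shape)         # both torch.Size([4, 8, 10])
```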
Recurrent Layers
torch.nn.RNNBase | Base class for RNN modules (RNN, LSTM, GRU). |
torch.nn.RNN | Apply a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence. |
torch.nn.LSTM | Apply a multi-layer long short-term memory (LSTM) RNN to an input sequence. |
torch.nn.GRU | Apply a multi-layer gated recurrent unit (GRU) RNN to an input sequence. |
torch.nn.RNNCell | An Elman RNN cell with tanh or ReLU non-linearity. |
torch.nn.LSTMCell | A long short-term memory (LSTM) cell. |
torch.nn.GRUCell | A gated recurrent unit (GRU) cell. |
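A minimal LSTM example (sizes arbitrary); with batch_first=True the input is (batch, seq, features):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=16, hidden_size=32, num_layers=2, batch_first=True)
x = torch.randn(4, 10, 16)  # (batch, seq, features)
output, (h_n, c_n) = lstm(x)
print(output.shape)         # torch.Size([4, 10, 32]): hidden states for all steps
print(h_n.shape)            # torch.Size([2, 4, 32]): final state per layer
```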
Transformer Layers
torch.nn.Transformer | A transformer model. |
torch.nn.TransformerEncoder | TransformerEncoder is a stack of N encoder layers. |
torch.nn.TransformerDecoder | TransformerDecoder is a stack of N decoder layers. |
torch.nn.TransformerEncoderLayer | TransformerEncoderLayer is made up of self-attn and feedforward network. |
torch.nn.TransformerDecoderLayer | TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network. |
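A sketch of stacking encoder layers (hyperparameters arbitrary):

```python
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=3)
x = torch.randn(2, 20, 64)  # (batch, seq, d_model) with batch_first=True
print(encoder(x).shape)     # torch.Size([2, 20, 64]): shape is preserved
```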
Linear Layers
torch.nn.Identity | A placeholder identity operator that is argument-insensitive. |
torch.nn.Linear | Applies a linear transformation to the incoming data: \(y = xA^{T} + b\). |
torch.nn.Bilinear | Applies a bilinear transformation to the incoming data: \(y = x_{1}^{T} A x_{2} + b\). |
torch.nn.LazyLinear | A torch.nn.Linear module where in_features is inferred. |
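For example:

```python
import torch
import torch.nn as nn

linear = nn.Linear(in_features=20, out_features=5)  # y = x @ A.T + b
x = torch.randn(3, 20)
print(linear(x).shape)                # torch.Size([3, 5])

lazy = nn.LazyLinear(out_features=5)  # in_features deferred until first call
_ = lazy(x)                           # now materialized with in_features=20
```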
Dropout Layers
torch.nn.Dropout | During training, randomly zeroes some of the elements of the input tensor with probability p. |
torch.nn.Dropout1d | Randomly zero out entire channels. |
torch.nn.Dropout2d | Randomly zero out entire channels. |
torch.nn.Dropout3d | Randomly zero out entire channels. |
torch.nn.AlphaDropout | Applies Alpha Dropout over the input. |
torch.nn.FeatureAlphaDropout | Randomly masks out entire channels. |
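Dropout is only active in training mode, which is why model.train() / model.eval() matters; a quick demonstration:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)
drop.train()
print(drop(x))  # about half the entries zeroed, survivors scaled by 1/(1-p)
drop.eval()
print(drop(x))  # identity at evaluation time
```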
Sparse Layers
torch.nn.Embedding | A simple lookup table that stores embeddings of a fixed dictionary and size. |
torch.nn.EmbeddingBag | Compute sums or means of 'bags' of embeddings, without instantiating the intermediate embeddings. |
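For example, mapping integer token ids to dense vectors (vocabulary size and dimension arbitrary):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=1000, embedding_dim=64)
tokens = torch.tensor([[1, 5, 9], [2, 0, 7]])  # (batch, seq) of indices
print(emb(tokens).shape)                       # torch.Size([2, 3, 64])
```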
Distance Functions
torch.nn.CosineSimilarity | Returns cosine similarity between \(x_1\) and \(x_2\), computed along dim. |
torch.nn.PairwiseDistance | Computes the pairwise distance between input vectors, or between columns of input matrices. |
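Both reduce a pair of batched vectors to one score per row:

```python
import torch
import torch.nn as nn

a, b = torch.randn(5, 128), torch.randn(5, 128)
print(nn.CosineSimilarity(dim=1)(a, b).shape)  # torch.Size([5]), values in [-1, 1]
print(nn.PairwiseDistance(p=2)(a, b).shape)    # torch.Size([5]), L2 distances
```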
Loss Functions
torch.nn.L1Loss | Creates a criterion that measures the mean absolute error (MAE) between each element in the input \(x\) and target \(y\). |
torch.nn.MSELoss | Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input \(x\) and target \(y\). |
torch.nn.CrossEntropyLoss | This criterion computes the cross entropy loss between input logits and target. |
torch.nn.CTCLoss | The Connectionist Temporal Classification loss. |
torch.nn.NLLLoss | The negative log likelihood loss. |
torch.nn.PoissonNLLLoss | Negative log likelihood loss with Poisson distribution of target. |
torch.nn.GaussianNLLLoss | Gaussian negative log likelihood loss. |
torch.nn.KLDivLoss | The Kullback-Leibler divergence loss. |
torch.nn.BCELoss | Creates a criterion that measures the Binary Cross Entropy between the target and the input probabilities: |
torch.nn.BCEWithLogitsLoss | This loss combines a Sigmoid layer and the BCELoss in one single class. |
torch.nn.MarginRankingLoss | Creates a criterion that measures the loss given inputs \(x_1\), \(x_2\), two 1D mini-batch or 0D Tensors, and a label 1D mini-batch or 0D Tensor \(y\) (containing 1 or -1). |
torch.nn.HingeEmbeddingLoss | Measures the loss given an input tensor \(x\) and a labels tensor \(y\) (containing 1 or -1). |
torch.nn.MultiLabelMarginLoss | Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input \(x\) (a 2D mini-batch Tensor) and output \(y\) (which is a 2D Tensor of target class indices). |
torch.nn.HuberLoss | Creates a criterion that uses a squared term if the absolute element-wise error falls below delta and a delta-scaled L1 term otherwise. |
torch.nn.SmoothL1Loss | Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise. |
torch.nn.SoftMarginLoss | Creates a criterion that optimizes a two-class classification logistic loss between input tensor \(x\) and target tensor \(y\) (containing 1 or -1). |
torch.nn.MultiLabelSoftMarginLoss | Creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy, between input \(x\) and target \(y\) of size \((N, C)\). |
torch.nn.CosineEmbeddingLoss | Creates a criterion that measures the loss given input tensors \(x_1\), \(x_2\) and a Tensor label \(y\) with values 1 or -1. |
torch.nn.MultiMarginLoss | Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input \(x\) (a 2D mini-batch Tensor) and output \(y\) (which is a 1D tensor of target class indices, \(0 \leq y \leq x.\mathrm{size}(1)-1\)). |
torch.nn.TripletMarginLoss | Creates a criterion that measures the triplet loss given input tensors \(x_1\), \(x_2\), \(x_3\) and a margin with a value greater than 0. |
torch.nn.TripletMarginWithDistanceLoss | Creates a criterion that measures the triplet loss given input tensors aa, pp, and nn (representing anchor, positive, and negative examples, respectively), and a nonnegative, real-valued function ("distance function") used to compute the relationship between the anchor and positive example ("positive distance") and the anchor and negative example ("negative distance"). |
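As a usage sketch, note the shape conventions: classification losses like CrossEntropyLoss take raw logits plus integer class targets, while regression losses compare tensors of identical shape.

```python
import torch
import torch.nn as nn

# Classification: logits of shape (batch, num_classes), integer targets.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 1, 0, 9])
print(nn.CrossEntropyLoss()(logits, targets))  # scalar loss

# Regression: prediction and target share a shape.
pred, target = torch.randn(4, 1), torch.randn(4, 1)
print(nn.MSELoss()(pred, target), nn.L1Loss()(pred, target))
```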
Vision Layers
torch.nn.PixelShuffle | Rearrange elements in a tensor according to an upscaling factor. |
torch.nn.PixelUnshuffle | Reverse the PixelShuffle operation. |
torch.nn.Upsample | Upsamples a given multi-channel 1D (temporal), 2D (spatial) or 3D (volumetric) data. |
torch.nn.UpsamplingNearest2d | Applies a 2D nearest neighbor upsampling to an input signal composed of several input channels. |
torch.nn.UpsamplingBilinear2d | Applies a 2D bilinear upsampling to an input signal composed of several input channels. |
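For example (channel counts arbitrary): PixelShuffle trades channels for spatial resolution, while Upsample interpolates.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 8, 8)
print(nn.PixelShuffle(upscale_factor=2)(x).shape)  # torch.Size([1, 4, 16, 16])

up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
print(up(torch.randn(1, 3, 8, 8)).shape)           # torch.Size([1, 3, 16, 16])
```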
Shuffle Layers
torch.nn.ChannelShuffle | Divides and rearranges the channels in a tensor. |
DataParallel Layers (multi-GPU, distributed)
torch.nn.DataParallel | Implements data parallelism at the module level. |
torch.nn.parallel.DistributedDataParallel | Implements distributed data parallelism based on torch.distributed at the module level. |
Utilities
From the torch.nn.utils module:
Utility functions to clip parameter gradients.
torch.nn.utils.clip_grad_norm_ | Clip the gradient norm of an iterable of parameters. |
torch.nn.utils.clip_grad_norm | Clip the gradient norm of an iterable of parameters (deprecated in favor of torch.nn.utils.clip_grad_norm_). |
torch.nn.utils.clip_grad_value_ | Clip the gradients of an iterable of parameters at specified value. |
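The in-place variants are the ones to use in a training loop, after backward() and before the optimizer step; a minimal sketch:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
# Rescale all gradients in place so their combined norm is at most 1.0.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```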
Utility functions to flatten and unflatten Module parameters to and from a single vector.
torch.nn.utils.parameters_to_vector | Flatten an iterable of parameters into a single vector. |
torch.nn.utils.vector_to_parameters | Copy slices of a vector into an iterable of parameters. |
Utility functions to fuse Modules with BatchNorm modules.
torch.nn.utils.fuse_conv_bn_eval | Fuse a convolutional module and a BatchNorm module into a single, new convolutional module. |
torch.nn.utils.fuse_conv_bn_weights | Fuse convolutional module parameters and BatchNorm module parameters into new convolutional module parameters. |
torch.nn.utils.fuse_linear_bn_eval | Fuse a linear module and a BatchNorm module into a single, new linear module. |
torch.nn.utils.fuse_linear_bn_weights | Fuse linear module parameters and BatchNorm module parameters into new linear module parameters. |
Utility functions to convert Module parameter memory formats.
torch.nn.utils.convert_conv2d_weight_memory_format | Convert the memory_format of nn.Conv2d.weight to the given memory_format. |
torch.nn.utils.convert_conv3d_weight_memory_format | Convert the memory_format of nn.Conv3d.weight to the given memory_format; the conversion applies recursively to nested nn.Module instances, including the module itself. |
Utility functions to apply and remove weight normalization from Module parameters.
torch.nn.utils.weight_norm | Apply weight normalization to a parameter in the given module. |
torch.nn.utils.remove_weight_norm | Remove the weight normalization reparameterization from a module. |
torch.nn.utils.spectral_norm | Apply spectral normalization to a parameter in the given module. |
torch.nn.utils.remove_spectral_norm | Remove the spectral normalization reparameterization from a module. |
Utility functions for initializing Module parameters.
torch.nn.utils.skip_init | Given a module class object and args / kwargs, instantiate the module without initializing parameters / buffers. |
Utility classes and functions for pruning Module parameters.
torch.nn.utils.prune.BasePruningMethod | Abstract base class for creation of new pruning techniques. |
torch.nn.utils.prune.PruningContainer | Container holding a sequence of pruning methods for iterative pruning. |
torch.nn.utils.prune.Identity | Utility pruning method that does not prune any units but generates the pruning parametrization with a mask of ones. |
torch.nn.utils.prune.RandomUnstructured | Prune (currently unpruned) units in a tensor at random. |
torch.nn.utils.prune.L1Unstructured | Prune (currently unpruned) units in a tensor by zeroing out the ones with the lowest L1-norm. |
torch.nn.utils.prune.RandomStructured | Prune entire (currently unpruned) channels in a tensor at random. |
torch.nn.utils.prune.LnStructured | Prune entire (currently unpruned) channels in a tensor based on their Ln-norm. |
torch.nn.utils.prune.CustomFromMask | Utility pruning method that prunes a tensor by applying a user-provided, pre-computed mask. |
torch.nn.utils.prune.identity | Apply pruning reparametrization without pruning any units. |
torch.nn.utils.prune.random_unstructured | Prune tensor by removing random (currently unpruned) units. |
torch.nn.utils.prune.l1_unstructured | Prune tensor by removing units with the lowest L1-norm. |
torch.nn.utils.prune.random_structured | Prune tensor by removing random channels along the specified dimension. |
torch.nn.utils.prune.ln_structured | Prune tensor by removing channels with the lowest Ln-norm along the specified dimension. |
torch.nn.utils.prune.global_unstructured | Globally prunes tensors corresponding to all parameters in parameters by applying the specified pruning_method. |
torch.nn.utils.prune.custom_from_mask | Prune tensor corresponding to parameter called name in module by applying the pre-computed mask in mask. |
torch.nn.utils.prune.remove | Remove the pruning reparameterization from a module and the pruning method from the forward hook. |
torch.nn.utils.prune.is_pruned | Check if a module is pruned by looking for pruning pre-hooks. |
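A short end-to-end sketch: apply a mask, inspect it, then make the pruning permanent.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(8, 4)
# Zero out the 50% of weights with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.5)
print(prune.is_pruned(layer))                     # True
print((layer.weight == 0).float().mean().item())  # 0.5

# Fold the mask into the parameter and remove the reparametrization.
prune.remove(layer, "weight")
```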
Parametrizations implemented using the new parametrization functionality in torch.nn.utils.parametrize.register_parametrization().
torch.nn.utils.parametrizations.orthogonal | Apply an orthogonal or unitary parametrization to a matrix or a batch of matrices. |
torch.nn.utils.parametrizations.weight_norm | Apply weight normalization to a parameter in the given module. |
torch.nn.utils.parametrizations.spectral_norm | Apply spectral normalization to a parameter in the given module. |
Utility functions to parametrize Tensors on existing Modules. Note that these functions can be used to parametrize a given Parameter or Buffer given a specific function that maps from an input space to the parametrized space. They are not parametrizations that would transform an object into a parameter. See the Parametrizations tutorial for more information on how to implement your own parametrizations.
torch.nn.utils.parametrize.register_parametrization | Register a parametrization to a tensor in a module. |
torch.nn.utils.parametrize.remove_parametrizations | Remove the parametrizations on a tensor in a module. |
torch.nn.utils.parametrize.cached | Context manager that enables the caching system within parametrizations registered with register_parametrization(). |
torch.nn.utils.parametrize.is_parametrized | Determine if a module has a parametrization. |
torch.nn.utils.parametrize.ParametrizationList | A sequential container that holds and manages the original parameters or buffers of a parametrized torch.nn.Module. |
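For instance, constraining a square weight to stay symmetric (the Symmetric module here is illustrative, along the lines of the Parametrizations tutorial):

```python
import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

# Maps the underlying free parameter to a symmetric matrix.
class Symmetric(nn.Module):
    def forward(self, X):
        return X.triu() + X.triu(1).transpose(-1, -2)

layer = nn.Linear(4, 4)
parametrize.register_parametrization(layer, "weight", Symmetric())
print(torch.allclose(layer.weight, layer.weight.T))  # True, by construction
```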
Utility functions to call a given Module in a stateless manner.
torch.nn.utils.stateless.functional_call | Perform a functional call on the module by replacing the module parameters and buffers with the provided ones. |
Utility functions in other modules
torch.nn.utils.rnn.PackedSequence | Holds the data and list of batch_sizes of a packed sequence. |
torch.nn.utils.rnn.pack_padded_sequence | Packs a Tensor containing padded sequences of variable length. |
torch.nn.utils.rnn.pad_packed_sequence | Pad a packed batch of variable length sequences. |
torch.nn.utils.rnn.pad_sequence | Pad a list of variable length Tensors with padding_value. |
torch.nn.utils.rnn.pack_sequence | Packs a list of variable length Tensors. |
torch.nn.utils.rnn.unpack_sequence | Unpack PackedSequence into a list of variable length Tensors. |
torch.nn.utils.rnn.unpad_sequence | Unpad padded Tensor into a list of variable length Tensors. |
torch.nn.Flatten | Flattens a contiguous range of dims into a tensor. |
torch.nn.Unflatten | Unflattens a tensor dim expanding it to a desired shape. |
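A sketch of the round trip for variable-length sequences (sizes arbitrary; by default pack_padded_sequence expects lengths sorted in decreasing order):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

seqs = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(2, 8)]  # varying lengths
padded = pad_sequence(seqs, batch_first=True)       # torch.Size([3, 5, 8])
packed = pack_padded_sequence(padded, lengths=torch.tensor([5, 3, 2]),
                              batch_first=True)

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
out_packed, _ = lstm(packed)                        # padded steps are skipped
out, out_lens = pad_packed_sequence(out_packed, batch_first=True)
print(out.shape)                                    # torch.Size([3, 5, 16])
```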
Quantized Functions
Quantization refers to techniques for performing computations and storing tensors at lower bitwidths than floating point precision. PyTorch supports both per tensor and per channel asymmetric linear quantization. To learn more how to use quantized functions in PyTorch, please refer to the Quantization documentation.
Lazy Modules Initialization
torch.nn.modules.lazy.LazyModuleMixin | A mixin for modules that lazily initialize parameters, also known as "lazy modules". |
torch.nn.parameter
torch.nn.parameter.Parameter | A kind of Tensor that is to be considered a module parameter. |
torch.nn.parameter.UninitializedParameter | A parameter that is not initialized. |
torch.nn.parameter.UninitializedBuffer | A buffer that is not initialized. |
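Assigning an nn.Parameter as a module attribute registers it automatically; a minimal example:

```python
import torch
import torch.nn as nn

class Scale(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensor with the module, so it is
        # returned by .parameters() and updated by optimizers.
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.scale * x

print(list(Scale().parameters()))  # one Parameter containing tensor([1.])
```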