SelfAttentionEncoder
- class opennmt.encoders.SelfAttentionEncoder(*args, **kwargs)
- Encoder using self-attention as described in https://arxiv.org/abs/1706.03762.
- Inherits from: opennmt.encoders.Encoder
- __init__(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, ffn_dropout=0.1, ffn_activation=<function relu>, mha_bias=True, position_encoder_class=<class 'opennmt.layers.position.SinusoidalPositionEncoder'>, maximum_relative_position=None, pre_norm=True, **kwargs)
- Initializes the parameters of the encoder (a usage sketch follows the parameter list).
- Parameters
- num_layers – The number of layers. 
- num_units – The number of hidden units. 
- num_heads – The number of heads in the multi-head attention. 
- ffn_inner_dim – The number of units of the inner linear transformation in the feed forward layer. 
- dropout – The probability of dropping units from the outputs.
- attention_dropout – The probability of dropping units from the attention.
- ffn_dropout – The probability of dropping units from the activation output in the feed forward layer.
- ffn_activation – The activation function to apply between the two linear transformations of the feed forward layer. 
- mha_bias – Whether to add a bias term after the linear layers in the multi-head attention.
- position_encoder_class – The opennmt.layers.PositionEncoder class to use for position encoding (or a callable that returns an instance).
- maximum_relative_position – Maximum relative position representation (from https://arxiv.org/abs/1803.02155). 
- pre_norm – If True, layer normalization is applied before each sub-layer; otherwise it is applied after.
- **kwargs – Additional layer arguments.
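The snippet below is a minimal sketch of constructing the encoder and running it on a batch of pre-embedded inputs. The constructor arguments mirror the signature above; the tensor shapes, the call convention (inputs, sequence_length, training), and the (outputs, state, sequence_length) return tuple follow the general opennmt.encoders.Encoder interface and are assumptions of this sketch, not part of this class reference.

```python
import tensorflow as tf
import opennmt

# Build the encoder with the documented defaults spelled out explicitly.
encoder = opennmt.encoders.SelfAttentionEncoder(
    num_layers=6,
    num_units=512,
    num_heads=8,
    ffn_inner_dim=2048,
    dropout=0.1,
    attention_dropout=0.1,
    ffn_dropout=0.1,
)

# Hypothetical batch of already-embedded inputs: 2 sequences padded to length 5,
# each timestep a 512-dimensional vector (matching num_units).
inputs = tf.random.uniform([2, 5, 512])
sequence_length = tf.constant([5, 3], dtype=tf.int32)

# Assumed Encoder call convention: returns encoder outputs, an encoder state,
# and the (possibly unchanged) sequence lengths.
outputs, state, outputs_length = encoder(
    inputs, sequence_length=sequence_length, training=True
)

print(outputs.shape)  # expected: (2, 5, 512)
```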