opennmt.layers.position module

Define position encoder classes.

opennmt.layers.position.make_positions(sequence_length, maximum_length=None)[source]

Builds a sequence of positions.

The first position is 1 because index 0 is reserved for padding positions.

Parameters:
  • sequence_length – The length of each sequence as a tf.Tensor of shape \([B]\).
  • maximum_length – Optional size of the returned time dimension. If not set, it is the maximum of sequence_length.
Returns:

The sequence of positions as a tf.Tensor of shape \([B, T]\).
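The behavior described above can be illustrated with a minimal pure-Python sketch. It works on nested lists rather than tf.Tensor, the helper name make_positions_sketch is hypothetical and not part of the library, and it assumes that padded time steps receive position 0.

```python
def make_positions_sketch(sequence_length, maximum_length=None):
    """Illustrative stand-in for make_positions using plain lists.

    sequence_length: list of B sequence lengths.
    Returns a B x T nested list of 1-indexed positions, with 0 on padding.
    """
    if maximum_length is None:
        # Default the time dimension to the longest sequence in the batch.
        maximum_length = max(sequence_length)
    return [
        [t + 1 if t < length else 0 for t in range(maximum_length)]
        for length in sequence_length
    ]


positions = make_positions_sketch([3, 1, 2])
# positions == [[1, 2, 3], [1, 0, 0], [1, 2, 0]]
```

Passing maximum_length only widens the time dimension; the extra steps are treated as padding.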

class opennmt.layers.position.PositionEncoder(reducer=<opennmt.layers.reducer.SumReducer object>)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

Base class for position encoders.

__call__(inputs, sequence_length=None, position=None)[source]

Apply position encoding to inputs.

Parameters:
  • inputs – The inputs of shape \([B, T, D]\).
  • sequence_length – The length of each sequence of shape \([B]\). If None, sequences are assumed to have the same length.
  • position – If known, the position to encode (1-indexed).
Returns:

A tf.Tensor of shape \([B, T, D]\) where \(D\) depends on the reducer.
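To see why the output depth \(D\) depends on the reducer, here is a small pure-Python sketch on nested lists. The two helper functions are hypothetical illustrations of a sum-style and a concatenation-style merge; they are not the library's reducer classes.

```python
def sum_reduce(inputs, encodings):
    """Element-wise sum: the output depth equals the input depth."""
    return [
        [[x + e for x, e in zip(x_vec, e_vec)] for x_vec, e_vec in zip(x_seq, e_seq)]
        for x_seq, e_seq in zip(inputs, encodings)
    ]


def concat_reduce(inputs, encodings):
    """Concatenation on the depth axis: the output depth is the sum of both depths."""
    return [
        [x_vec + e_vec for x_vec, e_vec in zip(x_seq, e_seq)]
        for x_seq, e_seq in zip(inputs, encodings)
    ]


# One sequence (B=1) of two time steps (T=2) with depth D=2.
inputs = [[[1.0, 2.0], [3.0, 4.0]]]
encodings = [[[0.5, 0.5], [0.25, 0.25]]]

summed = sum_reduce(inputs, encodings)           # depth stays 2
concatenated = concat_reduce(inputs, encodings)  # depth becomes 4
```

With the default sum reducer shown in the class signature, the position encodings must have the same depth as the inputs.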

call(inputs, sequence_length=None, position=None)[source]

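Implements the layer’s logic: applies position encoding to inputs.

Parameters:
  • inputs – The inputs of shape \([B, T, D]\).
  • sequence_length – The length of each sequence of shape \([B]\). If None, sequences are assumed to have the same length.
  • position – If known, the position to encode (1-indexed).
Returns:

A tf.Tensor of shape \([B, T, D]\) where \(D\) depends on the reducer.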

apply(inputs, sequence_length=None)[source]

Shortcut for __call__ that encodes sequences of inputs, optionally using sequence_length.

apply_one(inputs, position)[source]

Shortcut for __call__ that encodes inputs at a single known position.

encode(positions, depth, dtype=tf.float32)[source]

Creates position encodings.

Parameters:
  • positions – The positions to encode of shape \([B, ...]\).
  • depth – The encoding depth \(D\).
  • dtype – The encoding type.
Returns:

A tf.Tensor of shape \([B, ..., D]\).

encode_sequence(sequence_length, depth, maximum_length=None, dtype=tf.float32)[source]

Creates position encodings for sequences.

Parameters:
  • sequence_length – The length of each sequence of shape \([B]\).
  • depth – The encoding depth \(D\).
  • maximum_length – Optional size of the returned time dimension. If not set, it is the maximum of sequence_length.
  • dtype – The encoding type.
Returns:

A tf.Tensor of shape \([B, T, D]\).

class opennmt.layers.position.PositionEmbedder(maximum_position=128, reducer=<opennmt.layers.reducer.SumReducer object>)[source]

Bases: opennmt.layers.position.PositionEncoder

Encodes positions with a lookup table.

__init__(maximum_position=128, reducer=<opennmt.layers.reducer.SumReducer object>)[source]

Initializes the position encoder.

Parameters:
  • maximum_position – The maximum position to embed. Positions greater than this value will be set to maximum_position.
  • reducer – An opennmt.layers.reducer.Reducer to merge inputs and position encodings.
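The clipping rule for maximum_position can be sketched in plain Python as follows. This is only an illustration: the function embed_positions is hypothetical, the table is filled with random values where the layer would hold trained weights, and the table size of maximum_position + 1 rows (row 0 for padding) is an assumption.

```python
import random


def embed_positions(positions, table, maximum_position):
    """Look up one encoding per position, clipping to maximum_position."""
    clipped = [min(p, maximum_position) for p in positions]
    return [table[p] for p in clipped]


maximum_position = 4
depth = 2
rng = random.Random(0)

# Assumed layout: one row per position from 0 (padding) to maximum_position.
table = [
    [rng.uniform(-1.0, 1.0) for _ in range(depth)]
    for _ in range(maximum_position + 1)
]

encodings = embed_positions([1, 4, 7], table, maximum_position)
# Positions 4 and 7 map to the same row because 7 is clipped to 4.
```

One consequence of this design is that all positions beyond maximum_position become indistinguishable, so the value should cover the longest sequences expected in practice.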

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

Subclasses of Layer or Model can override this method when they need a state-creation step between layer instantiation and layer call. It is typically used to create the weights of Layer subclasses.

Parameters:
  • input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

encode(positions, depth, dtype=tf.float32)[source]

Creates position encodings.

Parameters:
  • positions – The positions to encode of shape \([B, ...]\).
  • depth – The encoding depth \(D\).
  • dtype – The encoding type.
Returns:

A tf.Tensor of shape \([B, ..., D]\).

class opennmt.layers.position.SinusoidalPositionEncoder(reducer=<opennmt.layers.reducer.SumReducer object>)[source]

Bases: opennmt.layers.position.PositionEncoder

Encodes positions with sine waves as described in https://arxiv.org/abs/1706.03762.

encode(positions, depth, dtype=tf.float32)[source]

Creates position encodings.

Parameters:
  • positions – The positions to encode of shape \([B, ...]\).
  • depth – The encoding depth \(D\).
  • dtype – The encoding type.
Returns:

A tf.Tensor of shape \([B, ..., D]\).
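
As a rough illustration of the sine wave encoding, the sketch below follows the formula from the referenced paper, where even dimensions use a sine and odd dimensions a cosine with wavelengths growing geometrically up to 10000. The function name sinusoidal_encoding is hypothetical, it encodes a single position into a plain list, and the library may order the sine and cosine components differently.

```python
import math


def sinusoidal_encoding(position, depth):
    """Encode one position as a list of `depth` values (paper formula)."""
    if depth % 2 != 0:
        raise ValueError("depth must be even to pair sine and cosine values")
    encoding = []
    for i in range(depth // 2):
        # Each sine/cosine pair shares one frequency.
        angle = position / (10000 ** (2 * i / depth))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


encoding = sinusoidal_encoding(1, 4)
# encoding holds [sin(1), cos(1), sin(0.01), cos(0.01)]
```

Because the encoding is a fixed function of the position, it needs no trained variables and has no built-in maximum position, unlike a lookup table.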