Decoder
- class opennmt.decoders.Decoder(*args, **kwargs)[source]
Base class for decoders.
Inherits from:
keras.src.engine.base_layer.Layer
- __init__(num_sources=1, vocab_size=None, output_layer=None, output_layer_bias=True, **kwargs)[source]
Initializes the decoder parameters.
If you don’t set one of vocab_size or output_layer here, you should later call the method opennmt.decoders.Decoder.initialize() to initialize this decoder instance.
- Parameters
num_sources – The number of source contexts expected by this decoder.
vocab_size – The output vocabulary size (optional if output_layer is set).
output_layer – The output projection layer (optional).
output_layer_bias – Add bias after the output projection layer.
**kwargs – Additional layer arguments.
- Raises
ValueError – if the number of source contexts num_sources is not supported by this decoder.
- property minimum_sources
The minimum number of source contexts supported by this decoder.
- property maximum_sources
The maximum number of source contexts supported by this decoder.
- property support_alignment_history
Returns True if this decoder can return the attention as alignment history.
- property initialized
Returns True if this decoder is initialized.
- initialize(vocab_size=None, output_layer=None)[source]
Initializes the decoder configuration.
- Parameters
vocab_size – The target vocabulary size.
output_layer – The output layer to use.
- Raises
ValueError – if neither vocab_size nor output_layer is set.
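The deferred-initialization contract above (set vocab_size, output_layer, or both, either at construction time or later via initialize()) can be sketched with a minimal stand-in class. SketchDecoder is an illustrative name, not part of the library:

```python
class SketchDecoder:
    """Minimal stand-in mirroring the deferred-initialization contract."""

    def __init__(self, vocab_size=None, output_layer=None):
        self.vocab_size = vocab_size
        self.output_layer = output_layer

    @property
    def initialized(self):
        # The decoder is usable once an output projection is known,
        # either from a vocabulary size or from an explicit layer.
        return self.vocab_size is not None or self.output_layer is not None

    def initialize(self, vocab_size=None, output_layer=None):
        if vocab_size is None and output_layer is None:
            raise ValueError("One of vocab_size or output_layer must be set")
        if vocab_size is not None:
            self.vocab_size = vocab_size
        if output_layer is not None:
            self.output_layer = output_layer


decoder = SketchDecoder()             # nothing set yet
assert not decoder.initialized
decoder.initialize(vocab_size=32000)  # deferred configuration
assert decoder.initialized
```

Methods such as initial_state() and call() check this flag and raise RuntimeError when the decoder was never initialized.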
- reuse_embeddings(embeddings)[source]
Reuses embeddings in the decoder output layer.
- Parameters
embeddings – The embeddings matrix to reuse.
- Raises
RuntimeError – if the decoder was not initialized.
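A common motivation for reusing embeddings in the output layer is weight tying: computing logits against the transpose of the target embedding matrix instead of learning a separate projection. That interpretation is an assumption here; the NumPy sketch below only illustrates the shape arithmetic involved:

```python
import numpy as np

vocab_size, depth = 6, 4
rng = np.random.default_rng(0)
embeddings = rng.random((vocab_size, depth))  # target embedding matrix E

# Tying the output layer to the embeddings means the projection computes
# logits as hidden @ E^T, so no separate output weight matrix is learned.
hidden = rng.random((2, depth))               # decoder outputs for 2 positions
logits = hidden @ embeddings.T                # shape (2, vocab_size)
assert logits.shape == (2, vocab_size)
```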
- initial_state(memory=None, memory_sequence_length=None, initial_state=None, batch_size=None, dtype=None)[source]
Returns the initial decoder state.
- Parameters
memory – Memory values to query.
memory_sequence_length – Memory values length.
initial_state – An initial state to start from, e.g. the last encoder state.
batch_size – The batch size to use.
dtype – The dtype of the state.
- Returns
A nested structure of tensors representing the decoder state.
- Raises
RuntimeError – if the decoder was not initialized.
ValueError – if one of batch_size or dtype is not set and neither initial_state nor memory is passed.
ValueError – if the number of source contexts (memory) does not match the number defined at the decoder initialization.
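The first ValueError reflects that batch_size can be given explicitly or inferred from memory or initial_state. A hypothetical helper (infer_batch_size is not a library function) makes that resolution order concrete:

```python
import numpy as np

def infer_batch_size(batch_size=None, initial_state=None, memory=None):
    """Illustrative resolution of the batch size, in the order the
    error conditions imply: explicit value first, else inferred from
    the leading dimension of memory or initial_state."""
    if batch_size is not None:
        return batch_size
    if memory is not None:
        return memory.shape[0]
    if initial_state is not None:
        return initial_state.shape[0]
    raise ValueError("batch_size is not set and cannot be inferred")


memory = np.zeros((8, 10, 64))  # [batch, time, depth] encoder outputs
assert infer_batch_size(memory=memory) == 8
assert infer_batch_size(batch_size=4) == 4
```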
- call(inputs, length_or_step=None, state=None, input_fn=None, sampling_probability=None, training=None)[source]
Runs the decoder layer on either a complete sequence (e.g. for training or scoring), or a single timestep (e.g. for iterative decoding).
- Parameters
inputs – The inputs to decode, can be a 3D (training) or 2D (iterative decoding) tensor.
length_or_step – For 3D inputs, the length of each sequence. For 2D inputs, the current decoding timestep.
state – The decoder state.
input_fn – A callable taking sampled ids and returning the decoding inputs.
sampling_probability – When inputs is the full sequence, the probability to read from the last sample instead of the true target.
training – Run in training mode.
- Returns
A tuple with the logits, the decoder state, and an attention vector.
- Raises
RuntimeError – if the decoder was not initialized.
ValueError – if the inputs rank is neither 2 nor 3.
ValueError – if length_or_step is invalid.
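The rank rules above can be sketched as a small dispatch function. This is an illustration of the documented contract, not the library's code; dispatch is a hypothetical name:

```python
import numpy as np

def dispatch(inputs):
    """Illustrative rank-based dispatch: rank-3 inputs decode a full
    sequence (length_or_step holds per-sequence lengths), rank-2 inputs
    decode a single step (length_or_step holds the current timestep)."""
    if inputs.ndim == 3:
        return "full-sequence"
    if inputs.ndim == 2:
        return "single-step"
    raise ValueError("inputs rank must be 2 or 3, got %d" % inputs.ndim)


full_seq = np.zeros((4, 20, 512))  # [batch, time, depth]
one_step = np.zeros((4, 512))      # [batch, depth]
assert dispatch(full_seq) == "full-sequence"
assert dispatch(one_step) == "single-step"
```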
- forward(inputs, sequence_length=None, initial_state=None, memory=None, memory_sequence_length=None, input_fn=None, sampling_probability=None, training=None)[source]
Runs the decoder on full sequences.
- Parameters
inputs – The 3D decoder input.
sequence_length – The length of each input sequence.
initial_state – The initial decoder state.
memory – Memory values to query.
memory_sequence_length – Memory values length.
input_fn – A callable taking sampled ids and returning the decoding inputs.
sampling_probability – The probability to read from the last sample instead of the true target.
training – Run in training mode.
- Returns
A tuple with the logits, the decoder state, and the attention vector.
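The sampling_probability argument implements a scheduled-sampling style choice: at each position, with that probability the model's own last sample is fed back instead of the ground-truth target. A minimal sketch of that per-step decision (next_input is a hypothetical helper):

```python
import random

def next_input(gold_token, sampled_token, sampling_probability, rng=random):
    """Illustrative scheduled-sampling choice: with probability
    sampling_probability, feed the model's own last sample instead of
    the ground-truth target token."""
    if rng.random() < sampling_probability:
        return sampled_token
    return gold_token


assert next_input("gold", "sampled", 0.0) == "gold"     # never sample
assert next_input("gold", "sampled", 1.0) == "sampled"  # always sample
```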
- abstract step(inputs, timestep, state=None, memory=None, memory_sequence_length=None, training=None)[source]
Runs one decoding step.
- Parameters
inputs – The 2D decoder input.
timestep – The current decoding step.
state – The decoder state.
memory – Memory values to query.
memory_sequence_length – Memory values length.
training – Run in training mode.
- Returns
A tuple with the decoder outputs, the decoder state, and the attention vector.
- dynamic_decode(embeddings, start_ids, end_id=2, initial_state=None, decoding_strategy=None, sampler=None, maximum_iterations=None, minimum_iterations=0, tflite_output_size=None)[source]
Decodes dynamically from start_ids.
- Parameters
embeddings – Target embeddings or opennmt.inputters.WordEmbedder to apply on decoded ids.
start_ids – Initial input IDs of shape \([B]\).
end_id – ID of the end of sequence token.
initial_state – Initial decoder state.
decoding_strategy – A opennmt.utils.DecodingStrategy instance that defines the decoding logic. Defaults to a greedy search.
sampler – A opennmt.utils.Sampler instance that samples predictions from the model output. Defaults to argmax sampling.
maximum_iterations – The maximum number of iterations to decode for.
minimum_iterations – The minimum number of iterations to decode for.
tflite_output_size – If not None, runs in TFLite-safe mode; the value is the size of the 1D output tensor.
- Returns
An opennmt.utils.DecodingResult instance.
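The default behavior (greedy search with argmax sampling, stopping at end_id or after maximum_iterations) can be sketched as a plain Python loop. This toy version is simplified to batch size 1 and takes an arbitrary step function standing in for the decoder's step(); none of these names belong to the library:

```python
import numpy as np

def greedy_decode(step_fn, start_id, end_id, maximum_iterations):
    """Illustrative greedy loop in the spirit of dynamic_decode: feed
    the argmax of each step's logits back as the next input, stopping
    at end_id or after maximum_iterations."""
    prev_id = start_id
    output = []
    for timestep in range(maximum_iterations):
        logits = step_fn(prev_id, timestep)  # [vocab_size] scores
        prev_id = int(np.argmax(logits))     # argmax sampling (the default)
        output.append(prev_id)
        if prev_id == end_id:
            break
    return output


# Toy step function over a 4-token vocabulary: always prefers token
# (previous_id + 1) % 4, so decoding from 0 walks 1, 2, ...
def toy_step(prev_id, timestep):
    logits = np.zeros(4)
    logits[(prev_id + 1) % 4] = 1.0
    return logits


# Stops as soon as end_id=2 is produced.
assert greedy_decode(toy_step, start_id=0, end_id=2, maximum_iterations=10) == [1, 2]
```

The real method additionally handles batched inputs, beam search through decoding_strategy, and non-argmax samplers.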
See also