Decoder
- class opennmt.decoders.Decoder(*args, **kwargs)[source]
Base class for decoders.
Inherits from:
keras.src.engine.base_layer.Layer
Extended by:
- __init__(num_sources=1, vocab_size=None, output_layer=None, output_layer_bias=True, **kwargs)[source]
Initializes the decoder parameters.
If you don’t set one of vocab_size or output_layer here, you should later call the method opennmt.decoders.Decoder.initialize() to initialize this decoder instance.
- Parameters
num_sources – The number of source contexts expected by this decoder.
vocab_size – The output vocabulary size (optional if output_layer is set).
output_layer – The output projection layer (optional).
output_layer_bias – Add bias after the output projection layer.
**kwargs – Additional layer arguments.
- Raises
ValueError – if the number of source contexts num_sources is not supported by this decoder.
- property minimum_sources
The minimum number of source contexts supported by this decoder.
- property maximum_sources
The maximum number of source contexts supported by this decoder.
- property support_alignment_history
Returns
True if this decoder can return the attention as alignment history.
- property initialized
Returns
True if this decoder is initialized.
- initialize(vocab_size=None, output_layer=None)[source]
Initializes the decoder configuration.
- Parameters
vocab_size – The target vocabulary size.
output_layer – The output layer to use.
- Raises
ValueError – if neither vocab_size nor output_layer is set.
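The two-phase construction described above (create the decoder first, provide the vocabulary size later) can be sketched in plain Python. This is an illustrative stand-in, not the actual OpenNMT-tf implementation; the class `TinyDecoder` and its lambda output layer are hypothetical:

```python
# Hypothetical sketch of the deferred-initialization pattern:
# the output projection is only built once initialize() receives
# either a vocabulary size or a ready-made output layer.

class TinyDecoder:
    def __init__(self, vocab_size=None, output_layer=None):
        self.output_layer = None
        if vocab_size is not None or output_layer is not None:
            self.initialize(vocab_size=vocab_size, output_layer=output_layer)

    @property
    def initialized(self):
        return self.output_layer is not None

    def initialize(self, vocab_size=None, output_layer=None):
        if vocab_size is None and output_layer is None:
            raise ValueError("One of vocab_size and output_layer must be set")
        if output_layer is None:
            # Stand-in for a dense projection onto the vocabulary.
            output_layer = lambda hidden: [hidden] * vocab_size
        self.output_layer = output_layer

decoder = TinyDecoder()           # vocabulary size not known yet
assert not decoder.initialized
decoder.initialize(vocab_size=8)  # e.g. after the vocabulary file is loaded
assert decoder.initialized
```

This mirrors why `initialize()` exists: the decoder can be constructed before the target vocabulary is available.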
- reuse_embeddings(embeddings)[source]
Reuses embeddings in the decoder output layer.
- Parameters
embeddings – The embeddings matrix to reuse.
- Raises
RuntimeError – if the decoder was not initialized.
- initial_state(memory=None, memory_sequence_length=None, initial_state=None, batch_size=None, dtype=None)[source]
Returns the initial decoder state.
- Parameters
memory – Memory values to query.
memory_sequence_length – Memory values length.
initial_state – An initial state to start from, e.g. the last encoder state.
batch_size – The batch size to use.
dtype – The dtype of the state.
- Returns
A nested structure of tensors representing the decoder state.
- Raises
RuntimeError – if the decoder was not initialized.
ValueError – if one of
batch_sizeordtypeis not set and neitherinitial_statenormemoryare not passed.ValueError – if the number of source contexts (
memory) does not match the number defined at the decoder initialization.
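The inference rule behind the first ValueError can be sketched as follows. This is a hypothetical helper (`resolve_state_shape` is not an OpenNMT-tf function), using dicts as stand-ins for tensors just to show the precedence of the arguments:

```python
# Hypothetical sketch: infer batch_size and dtype for the initial
# decoder state from memory or initial_state when they are not given.

def resolve_state_shape(memory=None, initial_state=None,
                        batch_size=None, dtype=None):
    """Returns (batch_size, dtype), falling back on the properties of
    memory or initial_state when either value is missing."""
    source = memory if memory is not None else initial_state
    if source is not None:
        if batch_size is None:
            batch_size = source["batch_size"]
        if dtype is None:
            dtype = source["dtype"]
    if batch_size is None or dtype is None:
        raise ValueError(
            "batch_size and dtype must be set when neither "
            "initial_state nor memory is passed")
    return batch_size, dtype
```

For example, `resolve_state_shape(memory={"batch_size": 4, "dtype": "float32"})` returns `(4, "float32")`, while calling it with no arguments raises the ValueError documented above.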
- call(inputs, length_or_step=None, state=None, input_fn=None, sampling_probability=None, training=None)[source]
Runs the decoder layer on either a complete sequence (e.g. for training or scoring), or a single timestep (e.g. for iterative decoding).
- Parameters
inputs – The inputs to decode, can be a 3D (training) or 2D (iterative decoding) tensor.
length_or_step – For 3D inputs, the length of each sequence. For 2D inputs, the current decoding timestep.
state – The decoder state.
input_fn – A callable taking sampled ids and returning the decoding inputs.
sampling_probability – When inputs is the full sequence, the probability to read from the last sample instead of the true target.
training – Run in training mode.
- Returns
A tuple with the logits, the decoder state, and an attention vector.
- Raises
RuntimeError – if the decoder was not initialized.
ValueError – if the inputs rank is neither 2 nor 3.
ValueError – if length_or_step is invalid.
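The rank-based dispatch described above can be sketched in plain Python. This is an illustration of the documented contract, not the real layer (the real `call` dispatches to `forward()` for 3D inputs and `step()` for 2D inputs); the function name `decode_mode` is hypothetical:

```python
# Hypothetical sketch of the dispatch in Decoder.call():
# 3D inputs -> full-sequence decoding (length_or_step is a length vector),
# 2D inputs -> single-step decoding (length_or_step is the timestep index).

def decode_mode(inputs_rank, length_or_step=None):
    if inputs_rank == 3:
        return "forward"  # training or scoring on a complete sequence
    if inputs_rank == 2:
        if length_or_step is not None and not isinstance(length_or_step, int):
            raise ValueError("length_or_step should be a scalar timestep")
        return "step"     # one iteration of iterative decoding
    raise ValueError("inputs rank should be 2 or 3")
```

For example, a batch of embedded target sequences (rank 3) selects the `forward` path, while a single timestep of embeddings (rank 2) with an integer step index selects `step`.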
- forward(inputs, sequence_length=None, initial_state=None, memory=None, memory_sequence_length=None, input_fn=None, sampling_probability=None, training=None)[source]
Runs the decoder on full sequences.
- Parameters
inputs – The 3D decoder input.
sequence_length – The length of each input sequence.
initial_state – The initial decoder state.
memory – Memory values to query.
memory_sequence_length – Memory values length.
input_fn – A callable taking sampled ids and returning the decoding inputs.
sampling_probability – The probability to read from the last sample instead of the true target.
training – Run in training mode.
- Returns
A tuple with the logits, the decoder state, and the attention vector.
- abstract step(inputs, timestep, state=None, memory=None, memory_sequence_length=None, training=None)[source]
Runs one decoding step.
- Parameters
inputs – The 2D decoder input.
timestep – The current decoding step.
state – The decoder state.
memory – Memory values to query.
memory_sequence_length – Memory values length.
training – Run in training mode.
- Returns
A tuple with the decoder outputs, the decoder state, and the attention vector.
- dynamic_decode(embeddings, start_ids, end_id=2, initial_state=None, decoding_strategy=None, sampler=None, maximum_iterations=None, minimum_iterations=0, tflite_output_size=None)[source]
Decodes dynamically from
start_ids.- Parameters
embeddings – Target embeddings or
opennmt.inputters.WordEmbedderto apply on decoded ids.start_ids – Initial input IDs of shape \([B]\).
end_id – ID of the end of sequence token.
initial_state – Initial decoder state.
decoding_strategy – An opennmt.utils.DecodingStrategy instance that defines the decoding logic. Defaults to a greedy search.
sampler – An opennmt.utils.Sampler instance that samples predictions from the model output. Defaults to argmax sampling.
maximum_iterations – The maximum number of iterations to decode for.
minimum_iterations – The minimum number of iterations to decode for.
tflite_output_size – If not None, run in a TFLite-safe mode and use this value as the size of the 1D output tensor.
- Returns
An opennmt.utils.DecodingResult instance.
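The defaults documented above (greedy search with argmax sampling, stopping at end_id or after maximum_iterations) can be sketched as a plain-Python loop. This is a toy illustration under stated assumptions: `step_fn` is a hypothetical stand-in for the embedding lookup plus one `step()` call, and lists of floats stand in for logit tensors:

```python
# Hypothetical sketch of the default dynamic_decode() behavior:
# greedy (argmax) decoding from a start id until end_id is produced
# or maximum_iterations is reached.

def greedy_decode(step_fn, start_id, end_id=2, maximum_iterations=10):
    ids = [start_id]
    state = None
    for timestep in range(maximum_iterations):
        logits, state = step_fn(ids[-1], timestep, state)
        # Argmax sampling, as with the default sampler.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if next_id == end_id:
            break
    return ids

# Toy step function whose logits always favor the end-of-sequence token,
# so decoding stops after a single step.
result = greedy_decode(lambda last_id, t, state: ([0.0, 0.1, 0.9], state),
                       start_id=1)
# result == [1, 2]
```

The real method batches this loop over \([B]\) sequences and delegates the stopping and sampling logic to the DecodingStrategy and Sampler objects.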