MixedInputter
- class opennmt.inputters.MixedInputter(*args, **kwargs)[source]
A multi-inputter that applies several transformations to the same data (e.g. combining word-level and character-level embeddings).
Inherits from:
opennmt.inputters.MultiInputter
- __init__(inputters, reducer=<opennmt.layers.reducer.ConcatReducer object>, dropout=0.0)[source]
Initializes a mixed inputter.
- Parameters
inputters – A list of opennmt.inputters.Inputter.
reducer – An opennmt.layers.Reducer to merge all inputs.
dropout – The probability to drop units in the merged inputs.
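The idea of applying several transformations to the same data and merging the results can be sketched in plain Python. This is a conceptual illustration only and does not use the OpenNMT-tf API: `word_features`, `char_features`, `concat_reduce`, and `mixed_input` are hypothetical helpers, and the toy feature functions stand in for real embeddings.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def word_features(token: str) -> Vector:
    # Hypothetical word-level transformation: token length and a capitalization flag.
    return [float(len(token)), 1.0 if token[:1].isupper() else 0.0]


def char_features(token: str) -> Vector:
    # Hypothetical character-level transformation: vowel count and digit count.
    vowels = sum(ch in "aeiouAEIOU" for ch in token)
    digits = sum(ch.isdigit() for ch in token)
    return [float(vowels), float(digits)]


def concat_reduce(vectors: Sequence[Vector]) -> Vector:
    # Merge the outputs of all transformations by concatenating them
    # along the feature axis (the role a concatenating reducer plays).
    merged: Vector = []
    for vector in vectors:
        merged.extend(vector)
    return merged


def mixed_input(
    tokens: Sequence[str],
    transforms: Sequence[Callable[[str], Vector]],
    reducer: Callable[[Sequence[Vector]], Vector] = concat_reduce,
) -> List[Vector]:
    # Apply every transformation to the same token, then reduce to one vector.
    return [reducer([transform(token) for transform in transforms]) for token in tokens]


mixed = mixed_input(["Hello", "w0rld"], [word_features, char_features])
```

Each token ends up with a single vector whose depth is the sum of the depths produced by the individual transformations.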
- make_dataset(data_file, training=None)[source]
Creates the base dataset required by this inputter.
- Parameters
data_file – The data file.
training – Run in training mode.
- Returns
A tf.data.Dataset instance or a list of tf.data.Dataset instances.
- get_dataset_size(data_file)[source]
Returns the dataset size.
If the inputter can efficiently compute the dataset size from a training file on disk, it can optionally override this method. Otherwise, we may compute the size later with a generic and slower approach (iterating over the dataset instance).
- Parameters
data_file – The data file.
- Returns
The dataset size or None.
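The fast path and the generic fallback described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `fast_size`, `slow_size`, and `dataset_size` are hypothetical names, and the fast path assumes a text file with one example per line.

```python
import os
import tempfile
from typing import Iterable, Optional


def fast_size(path: str) -> Optional[int]:
    # Hypothetical efficient override: assumes one example per line.
    # Returns None when the size cannot be computed from the file.
    if not os.path.isfile(path):
        return None
    with open(path, "rb") as data_file:
        return sum(1 for _ in data_file)


def slow_size(dataset: Iterable) -> int:
    # Generic and slower approach: iterate over every element.
    return sum(1 for _ in dataset)


def dataset_size(path: str, dataset: Iterable) -> int:
    size = fast_size(path)
    return size if size is not None else slow_size(dataset)


with tempfile.TemporaryDirectory() as tmp:
    train_path = os.path.join(tmp, "train.txt")
    with open(train_path, "w", encoding="utf-8") as train_file:
        train_file.write("a b c\nd e\nf\n")
    # Fast path: the file exists, so lines are counted directly.
    size_from_file = dataset_size(train_path, iter([]))
    # Fallback: no file to inspect, so the dataset itself is iterated.
    missing_path = os.path.join(tmp, "missing.txt")
    size_from_iteration = dataset_size(missing_path, iter(["x", "y"]))
```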
- get_length(features, ignore_special_tokens=False)[source]
Returns the length of the input features, if defined.
- Parameters
features – The dictionary of input features.
ignore_special_tokens – Ignore special tokens that were added by the inputter (e.g. <s> and/or </s>).
- Returns
The length.
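The effect of ignore_special_tokens can be illustrated with a small standalone sketch. It assumes the features dictionary stores the sequence length under a "length" key and that the inputter knows whether it added start and end markers; the `mark_start` and `mark_end` arguments are assumptions introduced for this example, not part of the signature above.

```python
from typing import Dict, Optional


def get_length(
    features: Dict[str, int],
    ignore_special_tokens: bool = False,
    mark_start: bool = True,  # assumption: <s> was prepended
    mark_end: bool = True,    # assumption: </s> was appended
) -> Optional[int]:
    # The length is only available if the features define it.
    length = features.get("length")
    if length is None:
        return None
    if ignore_special_tokens:
        # Subtract one position per special token added by the inputter.
        length -= int(mark_start) + int(mark_end)
    return length


features = {"length": 5}  # e.g. <s> a b c </s>
full_length = get_length(features)
content_length = get_length(features, ignore_special_tokens=True)
```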
- make_features(element=None, features=None, training=None)[source]
Creates features from data.
This is typically called in a data pipeline (such as Dataset.map). Common transformations include tokenization, parsing, vocabulary lookup, etc.
This method accepts either a single element from the dataset or a partially built dictionary of features.
- Parameters
element – An element from the dataset returned by opennmt.inputters.Inputter.make_dataset().
features – An optional and possibly partial dictionary of features to augment.
training – Run in training mode.
- Returns
A dictionary of tf.Tensor.
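The two calling conventions (a raw element, or a partial features dictionary to augment) can be sketched in plain Python. This is a conceptual illustration that uses lists and integers in place of tensors; the "tokens" and "length" keys and the whitespace tokenization are assumptions made for the example.

```python
from typing import Dict, List, Optional, Union

Features = Dict[str, Union[List[str], int]]


def make_features(
    element: Optional[str] = None,
    features: Optional[Features] = None,
) -> Features:
    # Start from a copy of the partial features, if any, so the caller's
    # dictionary is not mutated.
    features = dict(features) if features is not None else {}
    if "tokens" not in features:
        if element is None:
            raise ValueError("either element or features with 'tokens' is required")
        # Assumed transformation: whitespace tokenization of the raw element.
        features["tokens"] = element.split()
    if "length" not in features:
        # Augment the dictionary with a derived feature.
        features["length"] = len(features["tokens"])
    return features


from_element = make_features(element="hello world")
from_partial = make_features(features={"tokens": ["a", "b", "c"]})
```

Either call returns a complete dictionary, which is what lets the same function serve both a raw dataset element and features that were partially built upstream.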
- build(input_shape)[source]
Creates the variables of the layer (for subclass implementers).
This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().
This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).
- Parameters
input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
- call(features, training=None)[source]
Creates the model input from the features (e.g. word embeddings).
- Parameters
features – A dictionary of tf.Tensor, the output of opennmt.inputters.Inputter.make_features().
training – Run in training mode.
- Returns
The model input.