MixedInputter

class opennmt.inputters.MixedInputter(*args, **kwargs)[source]

A multi inputter that applies several transformations to the same data (e.g. combining word-level and character-level embeddings).

Inherits from: opennmt.inputters.MultiInputter
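
The idea of applying several transformations to the same data and merging the results can be illustrated without the library. The following is a minimal pure-Python sketch, not OpenNMT code: word_level, char_level, concat_reducer and mixed_input are hypothetical names, and the hand-written features merely stand in for learned embeddings. It assumes the default reducer concatenates the per-token outputs, as the ConcatReducer default in the constructor signature suggests.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def word_level(token: str) -> Vector:
    # Hypothetical word-level transformation: token length and capitalization flag.
    return [float(len(token)), 1.0 if token[:1].isupper() else 0.0]


def char_level(token: str) -> Vector:
    # Hypothetical character-level transformation: vowel count, digit count,
    # and number of distinct characters.
    vowels = sum(ch in "aeiou" for ch in token.lower())
    digits = sum(ch.isdigit() for ch in token)
    return [float(vowels), float(digits), float(len(set(token)))]


def concat_reducer(vectors: Sequence[Vector]) -> Vector:
    # Merge the outputs of all transformations by concatenation.
    merged: Vector = []
    for vector in vectors:
        merged.extend(vector)
    return merged


def mixed_input(
    tokens: Sequence[str],
    transforms: Sequence[Callable[[str], Vector]],
    reducer: Callable[[Sequence[Vector]], Vector] = concat_reducer,
) -> List[Vector]:
    # Every transformation sees the same token; the reducer merges the results.
    return [reducer([transform(token) for transform in transforms]) for token in tokens]


mixed = mixed_input(["Hello", "world"], [word_level, char_level])
print(mixed[0])  # 2 word-level values followed by 3 character-level values
```

With concatenation, the size of each merged vector is the sum of the sizes produced by the individual transformations (here 2 + 3 = 5).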

__init__(inputters, reducer=<opennmt.layers.reducer.ConcatReducer object>, dropout=0.0)[source]

Initializes a mixed inputter.

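Parameters
  • inputters – A list of opennmt.inputters.Inputter.

  • reducer – An opennmt.layers.Reducer to merge all inputs.

  • dropout – The probability to drop units in the merged inputs.
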
make_dataset(data_file, training=None)[source]

Creates the base dataset required by this inputter.

Parameters
  • data_file – The data file.

  • training – Run in training mode.

Returns

A tf.data.Dataset instance or a list of tf.data.Dataset instances.

get_dataset_size(data_file)[source]

Returns the dataset size.

If the inputter can efficiently compute the dataset size from a training file on disk, it can optionally override this method. Otherwise, we may compute the size later with a generic and slower approach (iterating over the dataset instance).

Parameters

data_file – The data file.

Returns

The dataset size or None.
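
The trade-off described above, a fast file-specific count versus a generic iteration, can be sketched in plain Python. This is only an illustration under the assumption that the data file is a text file with one example per line; count_lines and count_by_iteration are hypothetical helpers, not functions of the library.

```python
from typing import Iterable


def count_lines(path: str, chunk_size: int = 1 << 20) -> int:
    # Efficient approach: count newline bytes in large chunks, without decoding
    # or parsing the examples.
    count = 0
    last = b"\n"
    with open(path, "rb") as data_file:
        while True:
            chunk = data_file.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            last = chunk[-1:]
    # A final line without a trailing newline is still one example.
    if last != b"\n":
        count += 1
    return count


def count_by_iteration(dataset: Iterable) -> int:
    # Generic fallback: walk through every element, which is slower because
    # each element has to be fully produced.
    return sum(1 for _ in dataset)
```

Both functions return the same number for a line-based file. The first one avoids building each example, which is why an inputter that knows its file format can answer faster than the generic fallback.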

input_signature()[source]

Returns the input signature of this inputter.

get_length(features, ignore_special_tokens=False)[source]

Returns the length of the input features, if defined.

Parameters
  • features – The dictionary of input features.

  • ignore_special_tokens – Ignore special tokens that were added by the inputter (e.g. <s> and/or </s>).

Returns

The length.
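
To make the effect of ignore_special_tokens concrete, here is a small pure-Python sketch. It assumes, for illustration only, that the features dictionary holds a "tokens" list and that the markers are <s> at the start and </s> at the end. The function is a stand-in and does not reproduce the library implementation.

```python
from typing import Dict, List


def get_length(
    features: Dict[str, List[str]],
    ignore_special_tokens: bool = False,
    bos: str = "<s>",
    eos: str = "</s>",
) -> int:
    tokens = features["tokens"]
    length = len(tokens)
    if ignore_special_tokens:
        # Only discount markers at the sequence boundaries.
        if tokens and tokens[0] == bos:
            length -= 1
        if tokens and tokens[-1] == eos:
            length -= 1
    return length


features = {"tokens": ["<s>", "hello", "world", "</s>"]}
print(get_length(features))                              # 4
print(get_length(features, ignore_special_tokens=True))  # 2
```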

make_features(element=None, features=None, training=None)[source]

Creates features from data.

This is typically called in a data pipeline (such as Dataset.map). Common transformations include tokenization, parsing, vocabulary lookup, etc.

This method accepts either a single element from the dataset or a partially built dictionary of features.

Parameters
  • element – An element from the dataset returned by opennmt.inputters.Inputter.make_dataset().

  • features – An optional and possibly partial dictionary of features to augment.

  • training – Run in training mode.

Returns

A dictionary of tf.Tensor.
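
The two accepted inputs, a raw element or a partial dictionary of features, can be sketched as follows. This is a simplified pure-Python illustration and not the library code: the vocab argument, the whitespace tokenization, and the "tokens", "ids" and "length" keys are assumptions made for the example, and plain lists replace tf.Tensor values.

```python
from typing import Dict, Optional


def make_features(
    vocab: Dict[str, int],
    element: Optional[str] = None,
    features: Optional[dict] = None,
) -> dict:
    # Copy the partial dictionary so the caller's object is not modified.
    features = dict(features) if features is not None else {}
    if "tokens" not in features:
        if element is None:
            raise ValueError("Either element or features with 'tokens' is required")
        # Tokenization step (whitespace split, for illustration).
        features["tokens"] = element.split()
    # Vocabulary lookup step, with unknown tokens mapped to <unk>.
    unk_id = vocab["<unk>"]
    features["ids"] = [vocab.get(token, unk_id) for token in features["tokens"]]
    features["length"] = len(features["tokens"])
    return features


vocab = {"<unk>": 0, "hello": 1, "world": 2}
from_element = make_features(vocab, element="hello there world")
from_partial = make_features(vocab, features={"tokens": ["world"]})
```

In the first call the tokens are computed from the raw element. In the second call the existing "tokens" entry is reused and only the missing entries are added.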

build(input_shape)[source]

Creates the variables of the layer (for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
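
The state-creation step between instantiation and first call can be illustrated with a toy layer in plain Python. LazyDense is a hypothetical class written for this example. It only mimics the pattern described above (build runs automatically before the first call) and does not use TensorFlow.

```python
import random
from typing import List, Sequence, Tuple


class LazyDense:
    def __init__(self, units: int, seed: int = 0):
        # No weights are created here: the input size is not known yet.
        self.units = units
        self.built = False
        self.kernel: List[List[float]] = []
        self._rng = random.Random(seed)

    def build(self, input_shape: Tuple[int, ...]) -> None:
        # Create the weights once the input shape is known.
        input_dim = input_shape[-1]
        self.kernel = [
            [self._rng.uniform(-0.1, 0.1) for _ in range(self.units)]
            for _ in range(input_dim)
        ]
        self.built = True

    def call(self, inputs: Sequence[float]) -> List[float]:
        # Vector-matrix product: one output value per unit.
        return [
            sum(x * self.kernel[i][j] for i, x in enumerate(inputs))
            for j in range(self.units)
        ]

    def __call__(self, inputs: Sequence[float]) -> List[float]:
        # build() is invoked automatically before the first execution of call().
        if not self.built:
            self.build((len(inputs),))
        return self.call(inputs)


layer = LazyDense(units=3)
built_before = layer.built
outputs = layer([1.0, 2.0])
```

After the first call, layer.built is True and the kernel has one row per input dimension, which is the information that was unavailable at construction time.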

call(features, training=None)[source]

Creates the model input from the features (e.g. word embeddings).

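Parameters
  • features – A dictionary of tf.Tensor, the output of opennmt.inputters.Inputter.make_features().

  • training – Run in training mode.

Returns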

The model input.