WordEmbedder

class opennmt.inputters.WordEmbedder(*args, **kwargs)

Simple word embedder.

Inherits from: opennmt.inputters.TextInputter

__init__(embedding_size=None, dropout=0.0, **kwargs)

Initializes the parameters of the word embedder.

Parameters
  • embedding_size – The size of the resulting embedding. If None, an embedding file must be provided.

  • dropout – The probability of dropping units in the embedding.

  • **kwargs – Additional layer keyword arguments.

set_decoder_mode(enable=True, mark_start=None, mark_end=None)

Make this inputter produce sequences for a decoder.

In this mode, the returned “ids_out” feature is the decoder output sequence and “ids” is the decoder input sequence.

Parameters
  • enable – Enable the decoder mode.

  • mark_start – Mark the sequence start. If None, keep the current value.

  • mark_end – Mark the sequence end. If None, keep the current value.
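
As a rough illustration of decoder mode, the relationship between “ids” and “ids_out” can be pictured as a one-step shift of the same sequence. This is a minimal pure-Python sketch, not the library's implementation: the helper name make_decoder_features and the marker ids 1 and 2 are assumptions made for this example.

```python
def make_decoder_features(token_ids, start_id=1, end_id=2):
    """Hypothetical helper: build shifted decoder input/output sequences."""
    # Assumed marker ids: 1 marks the sequence start, 2 marks the sequence end.
    full = [start_id] + list(token_ids) + [end_id]
    return {
        "ids": full[:-1],     # decoder input: start marker followed by the tokens
        "ids_out": full[1:],  # decoder output: the tokens followed by the end marker
    }


features = make_decoder_features([10, 11, 12])
```

At each position, the decoder reads an entry of “ids” and is expected to predict the entry of “ids_out” at the same position.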

get_length(features, ignore_special_tokens=False)

Returns the length of the input features, if defined.

Parameters
  • features – The dictionary of input features.

  • ignore_special_tokens – Ignore special tokens that were added by the inputter (e.g. <s> and/or </s>).

Returns

The length.
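
The effect of ignore_special_tokens can be sketched as follows. The example assumes that the features dictionary stores the sequence length under a "length" key and that the number of added markers is known. Both the key name and the helper length_without_markers are assumptions for illustration only, not the library's code.

```python
def length_without_markers(features, num_added_markers=0, ignore_special_tokens=False):
    """Hypothetical helper mirroring the documented behavior of get_length."""
    length = features.get("length")
    if length is None:
        return None  # the length is not defined for these features
    if ignore_special_tokens:
        # Subtract the markers (e.g. <s> and </s>) that were added to the sequence.
        length -= num_added_markers
    return length


features = {"ids": [1, 10, 11, 12, 2], "length": 5}
full_length = length_without_markers(features)
content_length = length_without_markers(
    features, num_added_markers=2, ignore_special_tokens=True
)
```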

get_oov_tokens(features)

initialize(data_config)

Initializes the inputter.

Parameters

data_config – A dictionary containing the data configuration set by the user.

make_features(element=None, features=None, training=None)

Converts word tokens to ids.
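
Conceptually, this conversion is a vocabulary lookup. The sketch below uses a plain dictionary and maps out-of-vocabulary words to a single unknown id. The vocabulary contents, the "<unk>" entry, its id, and the helper tokens_to_ids are all assumptions for illustration, and the library's actual lookup may differ.

```python
def tokens_to_ids(tokens, vocab, unk_token="<unk>"):
    """Hypothetical helper: map each word token to its vocabulary id."""
    unk_id = vocab[unk_token]
    # Words missing from the vocabulary fall back to the unknown id.
    return [vocab.get(token, unk_id) for token in tokens]


vocab = {"<unk>": 0, "hello": 1, "world": 2}  # assumed toy vocabulary
ids = tokens_to_ids("hello brave world".split(), vocab)
```

Here "brave" is not in the toy vocabulary, so it receives the unknown id.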

build(input_shape)

Creates the variables of the layer (for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(features, training=None)

Creates the model input from the features (e.g. word embeddings).

Parameters
  • features – The dictionary of input features.

  • training – Whether to run in training mode.

Returns

The model input.
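
For a word embedder, producing the model input amounts to looking up one vector per token id. The following is a minimal sketch with plain Python lists. The table values and the helper embed are assumptions for illustration, and dropout is not modeled here.

```python
def embed(ids, embedding_table):
    """Hypothetical helper: select one embedding row per token id."""
    return [embedding_table[token_id] for token_id in ids]


# Assumed toy table: 3 vocabulary entries, embedding size 2.
embedding_table = [
    [0.0, 0.0],
    [0.1, 0.2],
    [0.3, 0.4],
]
vectors = embed([1, 2, 1], embedding_table)
```

The result has one row per input id, each of length embedding_size.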

visualize(model_root, log_dir)

Visualizes the transformation, usually embeddings.

Parameters
  • model_root – The root model object.

  • log_dir – The active log directory.

map_v1_weights(weights)