opennmt.inputters.record_inputter module

Define inputters reading from TFRecord files.

class opennmt.inputters.record_inputter.SequenceRecordInputter(dtype=tf.float32)[source]

Bases: opennmt.inputters.inputter.Inputter

Inputter that reads variable-length tensors.

Each record contains the following fields:

  • shape: the shape of the tensor as an int64 list.
  • values: the flattened tensor values as a dtype list.

Tensors are expected to be of shape [time, depth].
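The record layout described above can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, not the library's implementation: flatten_sequence and restore_sequence are hypothetical helper names introduced only for this example, and a list of equal-length rows stands in for a [time, depth] tensor.

```python
def flatten_sequence(sequence):
    """Turn a [time, depth] nested list into the two record fields.

    Hypothetical helper for illustration; not part of opennmt.
    """
    time = len(sequence)
    depth = len(sequence[0]) if time else 0
    if any(len(row) != depth for row in sequence):
        raise ValueError("all rows must have the same depth")
    # Row-major flattening: all values of step 0, then step 1, and so on.
    values = [float(value) for row in sequence for value in row]
    return {"shape": [time, depth], "values": values}


def restore_sequence(record):
    """Rebuild the [time, depth] nested list from the two record fields."""
    time, depth = record["shape"]
    values = record["values"]
    if len(values) != time * depth:
        raise ValueError("values length does not match shape")
    return [values[t * depth:(t + 1) * depth] for t in range(time)]


record = flatten_sequence([[1, 2, 3], [4, 5, 6]])
restored = restore_sequence(record)
```

Storing the shape next to the flattened values is what lets each record carry a different time dimension while still being decoded back into a 2D tensor.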

__init__(dtype=tf.float32)[source]

Initializes the parameters of the record inputter.

Parameters:
  • dtype – The values type.

make_dataset(data_file, training=None)[source]

Creates the base dataset required by this inputter.

Parameters:
  • data_file – The data file.
  • training – Run in training mode.
Returns:

A tf.data.Dataset.

get_dataset_size(data_file)[source]

Returns the size of the dataset.

Parameters:
  • data_file – The data file.
Returns:

The total size.

get_receiver_tensors()[source]

Returns the input placeholders for serving.

make_features(element=None, features=None, training=None)[source]

Creates features from data.

Parameters:
  • element – An element from the dataset.
  • features – An optional dictionary of features to augment.
  • training – Run in training mode.
Returns:

A dictionary of tf.Tensor.

make_inputs(features, training=None)[source]

Creates the model input from the features.

Parameters:
  • features – A dictionary of tf.Tensor.
  • training – Run in training mode.
Returns:

The model input.

opennmt.inputters.record_inputter.write_sequence_record(vector, writer)[source]

Writes a vector as a TFRecord.

Parameters:
  • vector – A 2D Numpy float array.
  • writer – A tf.python_io.TFRecordWriter.
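A possible way to call this function is sketched below. It assumes a TensorFlow 1.x installation, where tf.python_io.TFRecordWriter is available, together with NumPy and opennmt. The wrapper name write_sequences and its arguments are hypothetical and exist only for this example.

```python
def write_sequences(path, sequences):
    """Write each [time, depth] array in `sequences` to a TFRecord file.

    Hypothetical wrapper for illustration; not part of opennmt.
    """
    # Imports are kept local so the sketch can be loaded even where
    # TensorFlow (assumed 1.x here) is not installed.
    import numpy as np
    import tensorflow as tf
    from opennmt.inputters.record_inputter import write_sequence_record

    writer = tf.python_io.TFRecordWriter(path)
    try:
        for sequence in sequences:
            # The documented input is a 2D Numpy float array.
            vector = np.asarray(sequence, dtype=np.float32)
            write_sequence_record(vector, writer)
    finally:
        # Close the writer even if a record fails to serialize.
        writer.close()
```

A file written this way should then be readable by a SequenceRecordInputter created with a matching dtype (tf.float32 by default).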