opennmt.inputters.record_inputter module

Define inputters reading from TFRecord files.

class opennmt.inputters.record_inputter.SequenceRecordInputter(dtype=tf.float32)[source]

Bases: opennmt.inputters.inputter.Inputter

Inputter that reads variable-length tensors.

Each record contains the following fields:

  • shape: the shape of the tensor as an int64 list.
  • values: the flattened tensor values as a list of dtype values.

Tensors are expected to be of shape [time, depth].
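The record layout described above can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the library's implementation: the helper names flatten_sequence and restore_sequence are hypothetical, and row-major ordering of the flattened values is an assumption. It shows how a [time, depth] tensor maps to the shape and values fields and back.

```python
from typing import List, Sequence, Tuple


def flatten_sequence(tensor: Sequence[Sequence[float]]) -> Tuple[List[int], List[float]]:
    """Hypothetical helper: split a [time, depth] nested list into (shape, values)."""
    time = len(tensor)
    depth = len(tensor[0]) if time else 0
    if any(len(row) != depth for row in tensor):
        raise ValueError("all time steps must have the same depth")
    # Assumption: values are flattened in row-major order (time step by time step).
    values = [float(v) for row in tensor for v in row]
    return [time, depth], values


def restore_sequence(shape: Sequence[int], values: Sequence[float]) -> List[List[float]]:
    """Hypothetical helper: rebuild the [time, depth] nested list from (shape, values)."""
    time, depth = shape
    if len(values) != time * depth:
        raise ValueError("values length does not match shape")
    return [list(values[t * depth:(t + 1) * depth]) for t in range(time)]


# A sequence of 2 time steps, each with 3 features.
tensor = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
shape, values = flatten_sequence(tensor)
restored = restore_sequence(shape, values)
```

Storing the shape next to the flattened values is what allows records of different lengths to share one file: each record carries enough information to be reshaped on its own.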

__init__(dtype=tf.float32)[source]

Initializes the parameters of the record inputter.

Parameters: dtype – The values type.

get_length(data)[source]

Returns the length of the input data, if defined.

make_dataset(data_file)[source]

Creates the dataset required by this inputter.

Parameters: data_file – The data file.
Returns: A tf.data.Dataset.

get_dataset_size(data_file)[source]

Returns the size of the dataset.

Parameters: data_file – The data file.
Returns: The total size.

transform(inputs, mode)[source]

Transforms inputs.

Parameters:
  • inputs – A (possibly nested structure of) tf.Tensor which depends on the inputter.
  • mode – A tf.estimator.ModeKeys mode.
Returns: The transformed input.

opennmt.inputters.record_inputter.write_sequence_record(vector, writer)[source]

Writes a vector as a TFRecord.

Parameters:
  • vector – A 2D Numpy array.
  • writer – A tf.python_io.TFRecordWriter.