create_lookup_tables
- opennmt.data.create_lookup_tables(vocabulary_path, num_oov_buckets=1, as_asset=True, unk_token=None)
Creates TensorFlow lookup tables from a vocabulary file.
- Parameters
vocabulary_path – Path to the vocabulary file.
num_oov_buckets – Number of out-of-vocabulary buckets.
as_asset – If True, the vocabulary file will be added as a graph asset. Otherwise, the content of the vocabulary will be embedded in the graph.
unk_token – The out-of-vocabulary token. Defaults to <unk>.
- Returns
A tuple containing:
The final vocabulary size.
The tf.lookup table mapping tokens to ids.
The tf.lookup table mapping ids to tokens.
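
The shape of the returned tuple can be illustrated with a small pure-Python sketch. It does not use TensorFlow or OpenNMT and is not the library's implementation. The helper name build_lookup_sketch, the crc32 hash, and the sample vocabulary are hypothetical. The sketch also assumes that the final vocabulary size counts the out-of-vocabulary buckets, that unknown tokens receive ids placed right after the in-vocabulary ids, and that ids outside the vocabulary map back to the unknown token.

```python
import zlib


def build_lookup_sketch(tokens, num_oov_buckets=1, unk_token="<unk>"):
    """Hypothetical pure-Python stand-in for the returned tuple."""
    if num_oov_buckets < 1:
        # This sketch only covers the case with at least one OOV bucket.
        raise ValueError("num_oov_buckets must be >= 1 in this sketch")

    token_to_id = {token: index for index, token in enumerate(tokens)}
    base_size = len(tokens)

    def tokens_to_ids(token):
        if token in token_to_id:
            return token_to_id[token]
        # Assumption: unknown tokens are hashed into one of the OOV buckets
        # that follow the in-vocabulary ids. crc32 is only a stand-in hash.
        bucket = zlib.crc32(token.encode("utf-8")) % num_oov_buckets
        return base_size + bucket

    def ids_to_tokens(index):
        if 0 <= index < base_size:
            return tokens[index]
        # Assumption: ids outside the vocabulary map to the unknown token.
        return unk_token

    # Assumption: the reported size includes the OOV buckets.
    return base_size + num_oov_buckets, tokens_to_ids, ids_to_tokens


# Sample vocabulary, one token per line in the real vocabulary file.
vocab = ["<blank>", "<s>", "</s>", "hello", "world"]
size, tokens_to_ids, ids_to_tokens = build_lookup_sketch(vocab)
```

With a single bucket, every unknown token shares one id, so an embedding matrix sized from the first tuple element would have a row for it under the assumptions above.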