2.0 Transition Guide
This document details some changes introduced in OpenNMT-tf 2.0 and actions required by the user.
New requirements
Python 3
Python 3.5 (or above) is now required by OpenNMT-tf. See python3statement.org for more context about this decision.
TensorFlow 2.0
OpenNMT-tf has been completely redesigned for TensorFlow 2.0 which is now the minimal required TensorFlow version.
The correct TensorFlow version is declared as a dependency of OpenNMT-tf and will be automatically installed as part of the pip
installation:
pip install --upgrade pip
pip install OpenNMT-tf
Changed checkpoint layout
TensorFlow 2.0 introduced a new way to save checkpoints: variables are no longer matched by their name but by where they are stored relative to the root model object. Consequently, OpenNMT-tf V1 checkpoints are no longer compatible without conversion.
To smooth this transition, V1 checkpoints of the following models are automatically upgraded on load:
NMTBigV1
NMTMediumV1
NMTSmallV1
Transformer
TransformerBig
Improved main script command line
The command line parser has been improved to better manage task specifc options that are now located after the run type:
onmt-main <general options> train <train options>
Also, some options have changed:
the run type
train_and_eval
has been replaced bytrain --with_eval
the main script now includes the
average_checkpoints
andupdate_vocab
tasksdistributed training options are currently missing in 2.0
--session_config
has been removed as no longer applicable in TensorFlow 2.0
See onmt-main -h
for more details.
Changed vocabulary configuration
In OpenNMT-tf V1, models were responsible to declare the name of the vocabulary to look for, e.g.:
# V1 model inputter.
source_inputter=onmt.inputters.WordEmbedder(
vocabulary_file_key="source_words_vocabulary",
embedding_size=512)
meant that the user should configure the vocabulary like this:
# V1 vocabulary configuration.
data:
source_words_vocabulary: src_vocab.txt
This is no longer the case in V2 where vocabulary configuration now follows a general pattern, the same that is currently used for “embedding” and “tokenization” configurations.
Single vocabulary (e.g. language model):
data:
vocabulary: vocab.txt
Source and target vocabularies (e.g. sequence to sequence, tagging, etc.):
data:
source_vocabulary: src_vocab.txt
target_vocabulary: tgt_vocab.txt
Multi-source and target vocabularies:
data:
source_1_vocabulary: src_1_vocab.txt
source_2_vocabulary: src_2_vocab.txt
Nested inputs:
data:
source_1_1_vocabulary: src_1_1_vocab.txt
source_1_2_vocabulary: src_1_2_vocab.txt
source_2_vocabulary: src_2_vocab.txt
Changed predefined models
Predefined models do not require a model definition file and can be directly set to the --model_type
command line argument. Some of them have been changed for clarity:
V1 |
V2 |
Comment |
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
Not considered useful compared to the standard Transformer |
|
|
Use |
|
|
Use |
Changed parameters
Some parameters in the YAML configuration have been renamed or removed:
V1 |
V2 |
Comment |
---|---|---|
|
|
|
|
Automatic value |
|
|
Automatic value |
|
|
|
Use steps instead of seconds to set the evaluation frequency |
|
Not implemented |
|
|
Set |
|
|
|
Use layer names instead of variable regexps |
|
Use |
|
|
Not implemented |
|
|
Dynamic loss scaling by default |
|
|
|
|
|
Not implemented |
|
|
Not implemented |
|
|
|
|
|
Not implemented |
|
|
|
Parameters taking reference to Python classes should also be revised when upgrading to V2, as the class likely changed in the process. This concerns:
params/optimizer
andparams/optimizer_params
params/decay_type
andparams/decay_params
New mixed precision workflow
OpenNMT-tf 2.0 is using the newly introduced graph rewriter to automatically convert parts of the graph from float32
to float16
.
Variables are casted on the fly and checkpoints no longer need to be converted for inference or to continue training in float32
. This means mixed precision is no longer a property of the model but should be enabled on the command line instead, e.g.:
onmt-main --model_type Transformer --config data.yml --auto_config --mixed_precision train