Inference
The available inference options can be listed with:
onmt-main infer -h
Example:
onmt-main --config data.yml --auto_config infer --features_file src-test.txt
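By default, the predictions are printed on the standard output. To save them to a file instead, set the --predictions_file option (the output path below is just an example):
onmt-main --config data.yml --auto_config infer --features_file src-test.txt --predictions_file src-test.txt.out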
Checkpoints averaging
The average_checkpoints run type can be used to average the parameters of several checkpoints, which usually improves the model performance. For example:
onmt-main \
  --config config/my_config.yml --auto_config \
  average_checkpoints \
  --output_dir run/baseline-enfr/avg \
  --max_count 5
will average the parameters of the 5 latest checkpoints from the model directory configured in config/my_config.yml and save a new checkpoint in the directory run/baseline-enfr/avg.
Then, execute the inference by setting the --checkpoint_path option, e.g.:
onmt-main \
  --config config/my_config.yml --auto_config \
  --checkpoint_path run/baseline-enfr/avg/ckpt-200000 \
  infer --features_file newstest2014.en.tok --predictions_file newstest2014.en.tok.out
To control how checkpoints are saved during training, configure the following options in your configuration file:
train:
  # (optional) Save a checkpoint every this many steps.
  save_checkpoints_steps: 5000
  # (optional) How many checkpoints to keep on disk.
  keep_checkpoint_max: 10
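Depending on your OpenNMT-tf version, checkpoints can also be averaged automatically at the end of the training via the train option average_last_checkpoints; treat this option name as an assumption to verify against the documentation of your version:
train:
  # (assumption: requires a version supporting automatic averaging)
  # Average the last checkpoints at the end of the training.
  average_last_checkpoints: 5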
Random sampling
Sampling predictions from the output distribution can be an effective decoding strategy for back-translation, as described by Edunov et al. 2018. To enable this feature, you should configure the parameter sampling_topk. Possible values are:
- k, sample from the k most likely tokens
- 0, sample from the full output distribution
- 1, no sampling (default)
For example:
params:
  beam_width: 1
  sampling_topk: 0
  sampling_temperature: 1
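To instead restrict sampling to the most likely tokens, set sampling_topk to a positive value. The values below are illustrative (a temperature above 1 flattens the distribution, a value below 1 sharpens it):
params:
  beam_width: 1
  # Sample only among the 10 most likely tokens (illustrative value).
  sampling_topk: 10
  # Flatten the distribution slightly for more diverse samples (illustrative value).
  sampling_temperature: 1.3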
Noising
Noising the decoded output is another possible decoding strategy for back-translation, as described in Edunov et al. 2018. Three types of noise are currently implemented:
- dropout: randomly drop words in the sequence
- replacement: randomly replace words by a filler token
- permutation: randomly permute words with a maximum distance
which can be combined in sequence, e.g.:
params:
  decoding_subword_token: ▁
  decoding_noise:
    - dropout: 0.1
    - replacement: [0.1, ⦅unk⦆]
    - permutation: 3
The parameter decoding_subword_token (here set to the SentencePiece spacer) is useful to apply noise at the word level instead of the subword level.
N-best list
An n-best list can be generated for models using beam search. You can configure it in your configuration file:
infer:
  n_best: 5
With this option, each input line generates N consecutive lines in the output, ordered from best to worst.
Note that N cannot be greater than the configured beam_width.
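For example, requesting a 5-best list requires a beam width of at least 5; a minimal configuration could look like this (the beam_width value is just an example):
params:
  beam_width: 5
infer:
  n_best: 5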
Scoring
The main OpenNMT-tf script can also be used to score existing translations via the score run type. It requires two command line options to be set:
- --features_file, the input labels;
- --predictions_file, the translations to score.
e.g.:
onmt-main \
  --config config/my_config.yml --auto_config \
  score \
  --features_file newstest2014.en.tok \
  --predictions_file newstest2014.en.tok.out
The command writes to the standard output the score of each line in the following format:
<score> ||| <translation>
where <score> is the negative log likelihood of the provided translation.
Tip: n-best list generation and scoring can be combined to rerank translations, as in the sketch below.
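A minimal reranking pipeline could look like the following sketch, assuming infer is configured with n_best: 5 (and a beam width of at least 5) and that the file names are illustrative. Since each input line produces 5 hypotheses, the source file must be repeated line by line before scoring, and since the score is a negative log likelihood, the lowest value in each group of 5 is kept:
# 1. Generate a 5-best list.
onmt-main --config config/my_config.yml --auto_config \
  infer --features_file newstest2014.en.tok --predictions_file nbest.out
# 2. Repeat each source line 5 times so it stays aligned with the n-best output.
awk '{ for (i = 0; i < 5; i++) print }' newstest2014.en.tok > sources.x5
# 3. Score every hypothesis against its source line.
onmt-main --config config/my_config.yml --auto_config \
  score --features_file sources.x5 --predictions_file nbest.out > scored.out
# 4. In each group of 5, keep the hypothesis with the lowest negative log likelihood.
awk -F ' [|][|][|] ' '(NR - 1) % 5 == 0 || $1 + 0 < best { best = $1 + 0; hyp = $2 } NR % 5 == 0 { print hyp }' scored.out > reranked.out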