OpenNMT

An open source neural machine translation system.

OpenNMT-py models

This page lists pretrained models for OpenNMT-py.

Translation

  New! NLLB 200 3.3B - Transformer (download)
  New! NLLB 200 1.3B - Transformer (download)
  New! NLLB 200 1.3B distilled - Transformer (download)
  New! NLLB 200 600M - Transformer (download)
Configuration: example YAML file to run inference (inference config); please change the source and target languages in the YAML (a usage sketch follows below)
SentencePiece model: SP Model
Results: see the forum
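Below is a minimal, unofficial sketch of running one of these NLLB checkpoints from Python. All file names (nllb-200-600M-onmt.pt, nllb-inference.yaml, nllb-spm.model) are placeholders for the downloads above, and the -config flag is assumed to point at the inference YAML; if that YAML already applies the SentencePiece model as a transform, skip the manual tokenization step and pass raw text instead.

    # Sketch only: file names below are placeholders for the downloads above.
    import subprocess
    import sentencepiece as spm

    # 1. Tokenize the raw source text with the SentencePiece model
    #    (only needed if the inference YAML does not apply it as a transform).
    sp = spm.SentencePieceProcessor(model_file="nllb-spm.model")
    with open("src.raw.txt", encoding="utf-8") as fin, \
         open("src.sp.txt", "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(" ".join(sp.encode(line.strip(), out_type=str)) + "\n")

    # 2. Translate with the downloaded checkpoint and inference config
    #    (edit the source/target language codes in the YAML first).
    subprocess.run(
        [
            "onmt_translate",
            "-config", "nllb-inference.yaml",
            "-model", "nllb-200-600M-onmt.pt",
            "-src", "src.sp.txt",
            "-output", "pred.sp.txt",
        ],
        check=True,
    )

    # 3. Detokenize the predictions back to plain text.
    with open("pred.sp.txt", encoding="utf-8") as fin, \
         open("pred.txt", "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(sp.decode(line.strip().split()) + "\n")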
  New! v3 English-German - Transformer Large (download)
BPE model: BPE
Tokenization options (see the pyonmttok sketch below): {"mode": "aggressive", "joiner_annotate": True, "preserve_placeholders": True, "case_markup": True, "soft_case_regions": True, "preserve_segmented_tokens": True, "segment_case": True, "segment_numbers": True, "segment_alphabet_change": True}
BLEU: newstest2014 = 31.2, newstest2016 = 40.7, newstest2017 = 32.9, newstest2018 = 49.1, newstest2019 = 45.9
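The tokenization options above follow the pyonmttok / OpenNMT Tokenizer naming. A minimal sketch that applies them together with the BPE model (the file name ende.bpe is a placeholder for the download above):

    # Sketch only: "ende.bpe" is a placeholder for the BPE model linked above.
    import pyonmttok

    tokenizer = pyonmttok.Tokenizer(
        "aggressive",                      # mode
        bpe_model_path="ende.bpe",
        joiner_annotate=True,
        preserve_placeholders=True,
        case_markup=True,
        soft_case_regions=True,
        preserve_segmented_tokens=True,
        segment_case=True,
        segment_numbers=True,
        segment_alphabet_change=True,
    )

    # Tokenize before translation, detokenize the model output afterwards.
    tokens, _ = tokenizer.tokenize("Hello world! This is a test.")
    print(tokens)
    print(tokenizer.detokenize(tokens))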
  English-German - v2 format model Transformer (download)
Configuration: base Transformer configuration with standard training options (sketched below)
Data: WMT with shared SentencePiece model
Note: original paper replication
BLEU: newstest2014 = 26.89, newstest2017 = 28.09
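As a rough illustration of what "base Transformer configuration with standard training options" typically means for OpenNMT-py, the hyperparameters below follow the base model of the original Transformer paper, using v2-era option names. This is a sketch, not the exact training file used for this checkpoint:

    # Illustrative only -- not the exact configuration used to train this model.
    base_transformer = {
        # Model size (Transformer "base")
        "encoder_type": "transformer",
        "decoder_type": "transformer",
        "layers": 6,
        "rnn_size": 512,            # model / hidden dimension
        "word_vec_size": 512,
        "transformer_ff": 2048,
        "heads": 8,
        "position_encoding": True,
        # Standard training options
        "optim": "adam",
        "adam_beta2": 0.998,
        "decay_method": "noam",
        "warmup_steps": 8000,
        "learning_rate": 2,
        "label_smoothing": 0.1,
        "dropout": 0.1,
        "batch_size": 4096,
        "batch_type": "tokens",
    }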
  German-English - 2-layer BiLSTM (download)
Configuration: 2-layer BiLSTM with hidden size 500, trained for 20 epochs
Data: IWSLT '14 DE-EN
BLEU: 30.33

Summarization

English

  2-layer LSTM (download)
Configuration: 2-layer LSTM with hidden size 500, trained for 20 epochs
Data: Gigaword (standard)
ROUGE F-scores on Gigaword: R1 = 33.60, R2 = 16.29, RL = 31.45
  2-layer LSTM with copy attention (download)
Configuration: 2-layer LSTM with hidden size 500 and copy attention, trained for 20 epochs (copy-attention options sketched below)
Data: Gigaword (standard)
ROUGE F-scores on Gigaword: R1 = 35.51, R2 = 17.35, RL = 33.17
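Copy attention is switched on in OpenNMT-py through dedicated training flags. The sketch below lists the relevant options; only copy_attn itself is implied by the entry above, the remaining values are illustrative assumptions:

    # Sketch of OpenNMT-py copy-attention training options; apart from
    # copy_attn, the values are illustrative and not taken from this model.
    copy_attention_options = {
        "copy_attn": True,               # add the copy/pointer mechanism
        "reuse_copy_attn": True,         # reuse the standard attention for copying
        "copy_loss_by_seqlength": True,  # normalize the copy loss by sequence length
    }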
  Transformer (download)
Configuration: see the OpenNMT-py summarization example
Data: CNN/Daily Mail
  1-layer BiLSTM (download)
Configuration: see the OpenNMT-py summarization example
Data: CNN/Daily Mail
ROUGE F-scores: R1 = 39.12, R2 = 17.35, RL = 36.12

Chinese

  1-layer BiLSTM (download)
Author: playma
Configuration: preprocessing with src_vocab_size 8000, tgt_vocab_size 8000, src_seq_length 400, tgt_seq_length 30, src_seq_length_trunc 400, tgt_seq_length_trunc 100.
Training options: 1 layer, LSTM 300, WE 500, encoder_type brnn, input feed, AdaGrad, adagrad_accumulator_init 0.1, learning_rate 0.15, 30 epochs (see the option sketch below)
Data: LCSTS
ROUGE F-scores: R1 = 35.67, R2 = 23.06, RL = 33.14
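The options above, restated as a sketch in legacy (epoch-based) OpenNMT-py form and grouped by the tool they belong to; nothing beyond the listed values is implied:

    # Illustrative restatement of the options listed above
    # (legacy, epoch-based OpenNMT-py).
    preprocess_options = {
        "src_vocab_size": 8000,
        "tgt_vocab_size": 8000,
        "src_seq_length": 400,
        "tgt_seq_length": 30,
        "src_seq_length_trunc": 400,
        "tgt_seq_length_trunc": 100,
    }

    train_options = {
        "layers": 1,
        "rnn_size": 300,                  # LSTM hidden size
        "word_vec_size": 500,             # "WE 500": word embedding size
        "encoder_type": "brnn",           # bidirectional LSTM encoder
        "input_feed": 1,
        "optim": "adagrad",
        "adagrad_accumulator_init": 0.1,
        "learning_rate": 0.15,
        "epochs": 30,                     # legacy option; newer versions train by steps
    }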

Dialog

  2-layer LSTM (download)
Configuration: 2 layers, LSTM 500, WE 500, input feed, dropout 0.2, global_attention mlp, start_decay_at 7, 13 epochs
Data: OpenSubtitles