OpenNMT

An open source neural machine translation system.

OpenNMT-py models

This page lists pretrained models for OpenNMT-py.

Translation

  New! NLLB 200 3.3B - Transformer (download)
  New! NLLB 200 1.3B - Transformer (download)
  New! NLLB 200 1.3B distilled - Transformer (download)
  New! NLLB 200 600M - Transformer (download)
Configuration: example YAML file to run inference (inference config); please change the source and target languages in the YAML (a usage sketch follows below)
SentencePiece model: SP Model
Results: see the forum
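Below is a minimal, unofficial sketch of running one of these NLLB checkpoints from Python. All file names (nllb-200-600M-onmt.pt, nllb-inference.yaml, nllb-spm.model) are placeholders for the downloads above, and the -config flag is assumed to point at the inference YAML; if that YAML already applies the SentencePiece model as a transform, skip the manual tokenization step and pass raw text instead.

    # Sketch only: file names below are placeholders for the downloads above.
    import subprocess
    import sentencepiece as spm

    # 1. Tokenize the raw source text with the SentencePiece model
    #    (only needed if the inference YAML does not apply it as a transform).
    sp = spm.SentencePieceProcessor(model_file="nllb-spm.model")
    with open("src.raw.txt", encoding="utf-8") as fin, \
         open("src.sp.txt", "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(" ".join(sp.encode(line.strip(), out_type=str)) + "\n")

    # 2. Translate with the downloaded checkpoint and inference config
    #    (edit the source/target language codes in the YAML first).
    subprocess.run(
        [
            "onmt_translate",
            "-config", "nllb-inference.yaml",
            "-model", "nllb-200-600M-onmt.pt",
            "-src", "src.sp.txt",
            "-output", "pred.sp.txt",
        ],
        check=True,
    )

    # 3. Detokenize the predictions back to plain text.
    with open("pred.sp.txt", encoding="utf-8") as fin, \
         open("pred.txt", "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(sp.decode(line.strip().split()) + "\n")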
  New! v3 English-German - Transformer Large (download)
BPE model: BPE
Tokenization options (see the pyonmttok sketch below): {"mode": "aggressive", "joiner_annotate": True, "preserve_placeholders": True, "case_markup": True, "soft_case_regions": True, "preserve_segmented_tokens": True, "segment_case": True, "segment_numbers": True, "segment_alphabet_change": True}
BLEU: newstest2014 = 31.2, newstest2016 = 40.7, newstest2017 = 32.9, newstest2018 = 49.1, newstest2019 = 45.9
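The tokenization options above follow the pyonmttok / OpenNMT Tokenizer naming. A minimal sketch that applies them together with the BPE model (the file name ende.bpe is a placeholder for the download above):

    # Sketch only: "ende.bpe" is a placeholder for the BPE model linked above.
    import pyonmttok

    tokenizer = pyonmttok.Tokenizer(
        "aggressive",                      # mode
        bpe_model_path="ende.bpe",
        joiner_annotate=True,
        preserve_placeholders=True,
        case_markup=True,
        soft_case_regions=True,
        preserve_segmented_tokens=True,
        segment_case=True,
        segment_numbers=True,
        segment_alphabet_change=True,
    )

    # Tokenize before translation, detokenize the model output afterwards.
    tokens, _ = tokenizer.tokenize("Hello world! This is a test.")
    print(tokens)
    print(tokenizer.detokenize(tokens))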
  English-German - v2 format model Transformer (download)
Configuration: base Transformer configuration with standard training options (sketched below)
Data: WMT with shared SentencePiece model
Note: original paper replication
BLEU: newstest2014 = 26.89, newstest2017 = 28.09
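As a rough illustration of what "base Transformer configuration with standard training options" typically means for OpenNMT-py, the hyperparameters below follow the base model of the original Transformer paper, using v2-era option names. This is a sketch, not the exact training file used for this checkpoint:

    # Illustrative only -- not the exact configuration used to train this model.
    base_transformer = {
        # Model size (Transformer "base")
        "encoder_type": "transformer",
        "decoder_type": "transformer",
        "layers": 6,
        "rnn_size": 512,            # model / hidden dimension
        "word_vec_size": 512,
        "transformer_ff": 2048,
        "heads": 8,
        "position_encoding": True,
        # Standard training options
        "optim": "adam",
        "adam_beta2": 0.998,
        "decay_method": "noam",
        "warmup_steps": 8000,
        "learning_rate": 2,
        "label_smoothing": 0.1,
        "dropout": 0.1,
        "batch_size": 4096,
        "batch_type": "tokens",
    }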
  German-English - 2-layer BiLSTM (download)
Configuration: 2-layer BiLSTM with hidden size 500, trained for 20 epochs
Data: IWSLT '14 DE-EN
BLEU: 30.33

Summarization

English

  2-layer LSTM (download)
Configuration: 2-layer LSTM with hidden size 500, trained for 20 epochs
Data: Gigaword (standard)
ROUGE F-scores on Gigaword: R1 = 33.60, R2 = 16.29, RL = 31.45
  2-layer LSTM with copy attention (download)
Configuration: 2-layer LSTM with hidden size 500 and copy attention, trained for 20 epochs (copy-attention options sketched below)
Data: Gigaword (standard)
ROUGE F-scores on Gigaword: R1 = 35.51, R2 = 17.35, RL = 33.17
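Copy attention is switched on in OpenNMT-py through dedicated training flags. The sketch below lists the relevant options; only copy_attn itself is implied by the entry above, the remaining values are illustrative assumptions:

    # Sketch of OpenNMT-py copy-attention training options; apart from
    # copy_attn, the values are illustrative and not taken from this model.
    copy_attention_options = {
        "copy_attn": True,               # add the copy/pointer mechanism
        "reuse_copy_attn": True,         # reuse the standard attention for copying
        "copy_loss_by_seqlength": True,  # normalize the copy loss by sequence length
    }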
  Transformer (download)
Configuration: see the OpenNMT-py summarization example
Data: CNN/Daily Mail
  1-layer BiLSTM (download)
Configuration: see the OpenNMT-py summarization example
Data: CNN/Daily Mail
ROUGE F-scores: R1 = 39.12, R2 = 17.35, RL = 36.12

Chinese

  1-layer BiLSTM (download)
Author: playma
Configuration: preprocessing with src_vocab_size 8000, tgt_vocab_size 8000, src_seq_length 400, tgt_seq_length 30, src_seq_length_trunc 400, tgt_seq_length_trunc 100.
Training options: 1 layer, LSTM 300, WE 500, encoder_type brnn, input feed, AdaGrad, adagrad_accumulator_init 0.1, learning_rate 0.15, 30 epochs (see the option sketch below)
Data: LCSTS
ROUGE F-scores: R1 = 35.67, R2 = 23.06, RL = 33.14
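The options above, restated as a sketch in legacy (epoch-based) OpenNMT-py form and grouped by the tool they belong to; nothing beyond the listed values is implied:

    # Illustrative restatement of the options listed above
    # (legacy, epoch-based OpenNMT-py).
    preprocess_options = {
        "src_vocab_size": 8000,
        "tgt_vocab_size": 8000,
        "src_seq_length": 400,
        "tgt_seq_length": 30,
        "src_seq_length_trunc": 400,
        "tgt_seq_length_trunc": 100,
    }

    train_options = {
        "layers": 1,
        "rnn_size": 300,                  # LSTM hidden size
        "word_vec_size": 500,             # "WE 500": word embedding size
        "encoder_type": "brnn",           # bidirectional LSTM encoder
        "input_feed": 1,
        "optim": "adagrad",
        "adagrad_accumulator_init": 0.1,
        "learning_rate": 0.15,
        "epochs": 30,                     # legacy option; newer versions train by steps
    }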

Dialog

  2-layer LSTM (download)
Configuration: 2 layers, LSTM 500, WE 500, input feed, dropout 0.2, global_attention mlp, start_decay_at 7, 13 epochs
Data: OpenSubtitles