Model
OpenNMT-tf can be used to train several types of models thanks to a modular and extensible design. Here is a non-exhaustive overview of supported models:
Machine translation
- Sequence to Sequence Learning with Neural Networks (Sutskever et al. 2014)
- Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al. 2014)
- Effective Approaches to Attention-based Neural Machine Translation (Luong et al. 2015)
- Guided Alignment Training for Topic-Aware Neural Machine Translation (Chen et al. 2016)
- Linguistic Input Features Improve Neural Machine Translation (Sennrich et al. 2016)
- Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (Wu et al. 2016)
- A Convolutional Encoder Model for Neural Machine Translation (Gehring et al. 2016)
- Attention Is All You Need (Vaswani et al. 2017)
- MS-UEdin Submission to the WMT2018 APE Shared Task: Dual-Source Transformer for Automatic Post-Editing (Junczys-Dowmunt et al. 2018)
- Scaling Neural Machine Translation (Ott et al. 2018)
- The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation (Chen et al. 2018)
- Self-Attention with Relative Position Representations (Shaw et al. 2018)
Speech recognition
- Listen, Attend and Spell (Chan et al. 2015)
Language modeling
- Language Models are Unsupervised Multitask Learners (Radford et al. 2019)
Sequence tagging
- End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF (Ma et al. 2016)
Most ideas and modules from these papers can also be reused for other models or tasks.
Catalog
OpenNMT-tf comes with a set of standard models that are defined in the catalog. These models can be directly selected with the --model_type command line option, e.g.:
onmt-main --model_type Transformer [...]
You can also get the list of predefined models by running onmt-main -h.
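For example, assuming a YAML run configuration (here named my_config.yml, a placeholder) that defines the data files and training options, a catalog model can be trained with a command along these lines:
onmt-main --model_type Transformer --config my_config.yml --auto_config train --with_eval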
Custom models
If you don’t find the model you are looking for in the catalog, OpenNMT-tf can load custom model definitions from external Python files. They should include a callable model that returns an opennmt.models.Model instance. For example, the model definition below extends the Transformer model to enable embeddings sharing:
import opennmt

class MyCustomTransformer(opennmt.models.Transformer):
    def __init__(self):
        super().__init__(
            source_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
            target_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
            num_layers=6,
            num_units=512,
            num_heads=8,
            ffn_inner_dim=2048,
            dropout=0.1,
            attention_dropout=0.1,
            ffn_dropout=0.1,
            share_embeddings=opennmt.models.EmbeddingsSharingLevel.ALL,
        )

    # Here you can override any method from the Model class for a customized behavior.

model = MyCustomTransformer
The custom model file should then be selected with the --model command line option, e.g.:
onmt-main --model config/models/custom_model.py [...]
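As the comment in the definition above suggests, any method of the model class can be overridden. For instance, here is a minimal sketch of a method that could be added to MyCustomTransformer to override auto_config() and change one of the default training values applied with --auto_config (the overridden key and value are illustrative only):
    def auto_config(self, num_replicas=1):
        config = super().auto_config(num_replicas=num_replicas)
        # Illustrative only: adjust a default training value returned by --auto_config.
        config["train"]["effective_batch_size"] = 16384
        return config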
This approach offers a high level of modeling freedom without changing the core implementation. Additionally, some public modules are defined to contain other modules and can be used to design complex architectures, such as opennmt.inputters.ParallelInputter, opennmt.inputters.MixedInputter, opennmt.encoders.ParallelEncoder, and opennmt.encoders.SequentialEncoder. For example, these container modules can be used to implement multi-source inputs, multi-modal training, mixed word/character embeddings, and arbitrarily complex encoder architectures (e.g. mixing convolution, RNN, self-attention, etc.), as in the sketch below.
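Here is a minimal sketch of such a definition (not part of the catalog; all hyperparameter values are placeholders): the source side mixes word and character-level embeddings with MixedInputter, and the encoder chains a bidirectional RNN with a self-attention stack using SequentialEncoder:
import opennmt

def model():
    return opennmt.models.SequenceToSequence(
        source_inputter=opennmt.inputters.MixedInputter(
            [
                opennmt.inputters.WordEmbedder(embedding_size=256),
                opennmt.inputters.CharConvEmbedder(
                    embedding_size=30,
                    num_outputs=128,
                    kernel_size=3,
                    stride=1,
                ),
            ],
            dropout=0.3,
        ),
        target_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
        encoder=opennmt.encoders.SequentialEncoder(
            [
                # num_units=256 so the bidirectional outputs concatenate to 512,
                # the default depth expected by SelfAttentionEncoder below.
                opennmt.encoders.RNNEncoder(
                    num_layers=1, num_units=256, bidirectional=True),
                opennmt.encoders.SelfAttentionEncoder(num_layers=4),
            ]
        ),
        decoder=opennmt.decoders.AttentionalRNNDecoder(
            num_layers=2, num_units=512),
    )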
Some examples are available in the config/models directory of the Git repository.