# Model

OpenNMT-tf can be used to train several types of models thanks to a modular and extensible design. Here is a non-exhaustive overview of supported tasks:

- Machine translation
- Speech recognition
- Language modeling
- Sequence tagging

Most ideas and modules from the papers behind these models can be reused for other models or tasks.

## Catalog

OpenNMT-tf comes with a set of standard models that are defined in the catalog. These models can be selected directly with the `--model_type` command line option, e.g.:

```bash
onmt-main --model_type Transformer [...]
```


You can also get the list of predefined models by running `onmt-main -h`.

## Custom models

If you don't find the model you are looking for in the catalog, OpenNMT-tf can load custom model definitions from external Python files. Each file should define a callable named `model` that returns an `opennmt.models.Model` instance. For example, the model definition below extends the Transformer model to enable embeddings sharing:

```python
import opennmt


class MyCustomTransformer(opennmt.models.Transformer):
    def __init__(self):
        super().__init__(
            source_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
            target_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
            num_layers=6,
            num_units=512,
            ffn_inner_dim=2048,
            dropout=0.1,
            attention_dropout=0.1,
            ffn_dropout=0.1,
            share_embeddings=opennmt.models.EmbeddingsSharingLevel.ALL,
        )

    # Here you can override any method from the Model class for a customized behavior.


model = MyCustomTransformer
```


The custom model file should then be selected with the `--model` command line option, e.g.:

```bash
onmt-main --model config/models/custom_model.py [...]
```
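To make the "callable named `model`" convention more concrete, here is a minimal sketch of how a Python file can be imported from a path and its `model` callable invoked. It uses only the standard library and is not the OpenNMT-tf loader itself: the helper `load_model_from_file` and the stand-in class `ToyModel` are hypothetical names introduced for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_model_from_file(path):
    """Import the Python file at `path` and call its `model` attribute.

    Hypothetical helper: it only illustrates the convention, it is not
    the function OpenNMT-tf uses internally.
    """
    spec = importlib.util.spec_from_file_location("custom_model", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    if not callable(getattr(module, "model", None)):
        raise ValueError(f"{path} must define a callable named 'model'")
    return module.model()


# Stand-in definition file. A real one would return an
# opennmt.models.Model instance instead of this toy class.
definition = '''
class ToyModel:
    name = "toy"

model = ToyModel
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "custom_model.py"
    model_path.write_text(definition)
    loaded = load_model_from_file(str(model_path))

print(type(loaded).__name__)
```

Because `model` only needs to be callable, it can be either a class (as in the Transformer example above) or a function that builds and returns the instance.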


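This approach offers a high level of modeling freedom without changing the core implementation. Additionally, some public modules are designed to contain other modules and can be used to build complex architectures:

- `opennmt.encoders.ParallelEncoder`
- `opennmt.encoders.SequentialEncoder`
- `opennmt.inputters.MixedInputter`
- `opennmt.inputters.ParallelInputter`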

For example, these container modules can be used to implement multi-source inputs, multimodal training, mixed word/character embeddings, and arbitrarily complex encoder architectures (e.g. mixing convolution, RNN, self-attention, etc.).
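The container idea can be sketched in plain Python, independently of TensorFlow. The toy example below assumes that an "encoder" is just a function from a list of floats to a list of floats. The names `SequentialContainer`, `ParallelContainer`, `scale`, and `shift` are hypothetical and do not reflect the actual OpenNMT-tf classes or their signatures.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Encoder = Callable[[Vector], Vector]


def scale(factor: float) -> Encoder:
    """Toy leaf encoder: multiplies every value by a constant."""
    return lambda inputs: [factor * x for x in inputs]


def shift(offset: float) -> Encoder:
    """Toy leaf encoder: adds a constant to every value."""
    return lambda inputs: [x + offset for x in inputs]


class SequentialContainer:
    """Applies the child encoders one after another."""

    def __init__(self, encoders: Sequence[Encoder]):
        self.encoders = list(encoders)

    def __call__(self, inputs: Vector) -> Vector:
        outputs = inputs
        for encoder in self.encoders:
            outputs = encoder(outputs)
        return outputs


class ParallelContainer:
    """Applies every child to the same input and concatenates the outputs."""

    def __init__(self, encoders: Sequence[Encoder]):
        self.encoders = list(encoders)

    def __call__(self, inputs: Vector) -> Vector:
        outputs: Vector = []
        for encoder in self.encoders:
            outputs.extend(encoder(inputs))
        return outputs


# Containers are encoders themselves, so they can be nested freely.
toy_encoder = SequentialContainer(
    [ParallelContainer([scale(2.0), shift(1.0)]), scale(0.5)]
)

print(toy_encoder([1.0, 2.0]))
```

The point of the design is that a container exposes the same interface as the modules it wraps, which is what allows arbitrary nesting without touching the core implementation.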

Some examples are available in the `config/models` directory of the Git repository.