class onmt.translate.translation_server.ServerModel(opt, model_id, tokenizer_opt=None, load=False, timeout=-1, on_timeout='to_cpu', model_root='./')[source]

Bases: object

Wrap a model with server functionality.

  • opt (dict) – Options for the Translator
  • model_id (int) – Model ID
  • tokenizer_opt (dict) – Options for the tokenizer or None
  • load (bool) – whether to load the model during __init__()
  • timeout (int) – Seconds before running do_timeout(). A negative value means no timeout
  • on_timeout (str) – What to do on timeout; options are [“to_cpu”, “unload”] (see do_timeout())
  • model_root (str) – Path to the model directory; it must contain the model and tokenizer files

Detokenize a single sequence.

Same args/returns as tokenize()


Timeout function that frees GPU memory.

Moves the model to CPU or unloads it, depending on the value of self.on_timeout
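
The timeout mechanism can be illustrated with a small conceptual sketch. This is not the onmt implementation; `ToyModel` and its attributes are hypothetical stand-ins showing how a threading.Timer can drive the `to_cpu`/`unload` behavior described above:

```python
import threading

# Conceptual sketch only -- ToyModel is a hypothetical stand-in, not
# onmt.translate.translation_server.ServerModel.
class ToyModel:
    def __init__(self, timeout=-1, on_timeout="to_cpu"):
        self.timeout = timeout
        self.on_timeout = on_timeout
        self.device = "gpu"
        self.loaded = True
        self.timer = None

    def reset_timer(self):
        # A negative timeout means the model never times out.
        if self.timeout < 0:
            return
        if self.timer is not None:
            self.timer.cancel()
        self.timer = threading.Timer(self.timeout, self.do_timeout)
        self.timer.start()

    def do_timeout(self):
        # Free GPU memory according to self.on_timeout.
        if self.on_timeout == "unload":
            self.loaded = False
        elif self.on_timeout == "to_cpu":
            self.device = "cpu"
```

Rearming the timer on every request keeps a busy model on GPU, while an idle one is moved off or dropped.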


De-tokenize the sequence (or not).

Same args/returns as tokenize()


Tokenize the sequence (or not).

Same args/returns as tokenize()


Parse the option set passed by the user using onmt.opts.

Parameters: opt (dict) – Options passed by the user
Returns: the full set of options for the Translator
Return type: opt (argparse.Namespace)

Move the model to GPU.


Tokenize a single sequence.

Parameters: sequence (str) – The sequence to tokenize.
Returns: the tokenized sequence.
Return type: tok (str)
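
A hypothetical stand-in with the same signature can make the contract concrete: tokenize() takes a str and returns a str. Real tokenization depends on tokenizer_opt; the whitespace handling here is purely illustrative:

```python
# Hypothetical stand-in matching the tokenize() signature (str -> str);
# it just normalizes whitespace rather than doing real subword tokenization.
def tokenize(sequence: str) -> str:
    tok = " ".join(sequence.split())
    return tok
```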

Core Server

exception onmt.translate.translation_server.ServerModelError[source]

Bases: Exception

class onmt.translate.translation_server.Timer(start=False)[source]

Bases: object

class onmt.translate.translation_server.TranslationServer[source]

Bases: object

clone_model(model_id, opt, timeout=-1)[source]

Clone the model model_id.

Different options may be passed. If opt is None, the same set of options is used.


Return the list of available models.

load_model(opt, model_id=None, **model_kwargs)[source]

Load a model given a set of options.

preload_model(opt, model_id=None, **model_kwargs)[source]

Preload the model: update the internal data structure.

It effectively loads the model if load is set.


Translate inputs.

We keep the same format as the Lua version, i.e. [{"id": model_id, "src": "sequence to translate"},{ ...}]

We use inputs[0]["id"] as the model id.
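
Since the request is plain JSON-style data, building and inspecting it needs no onmt code. A minimal sketch (the model id and source sentences are made up for illustration):

```python
# Build a request in the Lua-compatible format described above.
inputs = [
    {"id": 100, "src": "Hello world !"},
    {"id": 100, "src": "How are you ?"},
]

# The server selects the model from the first segment's id.
model_id = inputs[0]["id"]
sources = [segment["src"] for segment in inputs]
```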


Read the config file and preload or load the models.
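
For orientation, a config file along these lines is typically used; the exact schema and all values below (ids, filenames, option names) are illustrative and may differ across versions:

```json
{
    "models_root": "./available_models",
    "models": [
        {
            "id": 100,
            "model": "model_0.pt",
            "timeout": 600,
            "on_timeout": "to_cpu",
            "load": true,
            "opt": {
                "beam_size": 5
            }
        }
    ]
}
```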


Manually unload a model.

It frees the memory and cancels the timer.