Doc: Translation

Translations

class onmt.translate.Translation(src, src_raw, pred_sents, attn, pred_scores, tgt_sent, gold_score)

Container for a translated sentence.

src

LongTensor – src word ids

src_raw

[str] – raw src words

pred_sents

[[str]] – words from the n-best translations

pred_scores

[[float]] – log-probs of n-best translations

attns

[FloatTensor] – attention dist for each translation

gold_sent

[str] – words from gold translation

gold_score

[float] – log-prob of gold translation

log(sent_number)

Log translation.
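
As a rough illustration of how these attributes fit together, the sketch below defines a stand-in container with the same attribute names and a log method that formats them as text. It is not the OpenNMT-py implementation: the class name TranslationSketch, the output format, and the choice to return a string are all assumptions, and the tensor-valued attributes (src, attns) are omitted so the example runs without PyTorch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TranslationSketch:
    """Hypothetical stand-in mirroring the attributes documented above."""

    src_raw: List[str]            # raw source words
    pred_sents: List[List[str]]   # words of the n-best translations
    pred_scores: List[float]      # assumed here: one log-prob per translation
    gold_sent: Optional[List[str]] = None
    gold_score: Optional[float] = None

    def log(self, sent_number: int) -> str:
        """Format the best prediction (and the gold target, if present)."""
        lines = [f"SENT {sent_number}: {' '.join(self.src_raw)}"]
        lines.append(f"PRED {sent_number}: {' '.join(self.pred_sents[0])}")
        lines.append(f"PRED SCORE: {self.pred_scores[0]:.4f}")
        if self.gold_sent is not None and self.gold_score is not None:
            lines.append(f"GOLD {sent_number}: {' '.join(self.gold_sent)}")
            lines.append(f"GOLD SCORE: {self.gold_score:.4f}")
        return "\n".join(lines)
```

Calling log(1) on an instance built from a single prediction yields a "SENT 1: ..." line followed by the best prediction and its score; the gold lines appear only when both gold_sent and gold_score are set.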

Translator Class

class onmt.translate.Translator(model, fields, beam_size, n_best=1, max_length=100, global_scorer=None, copy_attn=False, logger=None, gpu=False, dump_beam='', min_length=0, stepwise_penalty=False, block_ngram_repeat=0, ignore_when_blocking=[], sample_rate='16000', window_size=0.02, window_stride=0.01, window='hamming', use_filter_pred=False, data_type='text', replace_unk=False, report_score=True, report_bleu=False, report_rouge=False, verbose=False, out_file=None)

Uses a model to translate a batch of sentences.

Parameters:
  • model (onmt.modules.NMTModel) – NMT model to use for translation
  • fields (dict of Fields) – data fields
  • beam_size (int) – size of beam to use
  • n_best (int) – number of translations produced
  • max_length (int) – maximum length output to produce
  • global_scorer (GlobalScorer) – object to rescore final translations
  • copy_attn (bool) – use copy attention during translation
  • gpu (bool) – use CUDA
  • dump_beam (str) – file to dump the beam search trace to, for debugging
  • logger (logging.Logger) – logger.

translate_batch(batch, data)

Translate a batch of sentences.

Mostly a wrapper around Beam.

Parameters:
  • batch (Batch) – a batch from a dataset object
  • data (Dataset) – the dataset object
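
To make the roles of beam_size, n_best, max_length and min_length concrete, here is a toy beam search in plain Python. It is a minimal sketch of the general technique, not the Beam class that translate_batch wraps: the function toy_beam_search, the step_fn callback and the "</s>" end token are hypothetical, and there is no length penalty, n-gram blocking or global scorer. step_fn is assumed to return strictly positive probabilities.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[float, List[str]]  # (cumulative log-prob, tokens)


def toy_beam_search(
    step_fn: Callable[[List[str]], Dict[str, float]],
    beam_size: int,
    n_best: int = 1,
    max_length: int = 100,
    min_length: int = 0,
    eos: str = "</s>",
) -> List[Hypothesis]:
    """Return up to n_best hypotheses, highest log-prob first."""
    beam: List[Hypothesis] = [(0.0, [])]
    finished: List[Hypothesis] = []
    for step in range(max_length):
        candidates: List[Hypothesis] = []
        for score, tokens in beam:
            # step_fn maps a token prefix to next-token probabilities.
            for token, prob in step_fn(tokens).items():
                if token == eos and step < min_length:
                    continue  # output would be too short to end here
                candidates.append((score + math.log(prob), tokens + [token]))
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = []
        for hyp in candidates[:beam_size]:
            # Hypotheses that emitted EOS leave the beam and become results.
            (finished if hyp[1][-1] == eos else beam).append(hyp)
        if not beam:
            break
    finished.extend(beam)  # anything cut off at max_length without EOS
    finished.sort(key=lambda h: h[0], reverse=True)
    return finished[:n_best]
```

A larger beam_size keeps more partial hypotheses alive at each step, while n_best only controls how many of the surviving hypotheses are returned at the end.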

class onmt.translate.TranslationBuilder(data, fields, n_best=1, replace_unk=False, has_tgt=False)

Build a word-based translation from the batch output of translator and the underlying dictionaries.

Replacement based on “Addressing the Rare Word Problem in Neural Machine Translation” [LSL+15]

Parameters:
  • data (Dataset) – the dataset object
  • fields (dict of Fields) – data fields
  • n_best (int) – number of translations produced
  • replace_unk (bool) – replace unknown words using attention
  • has_tgt (bool) – will the batch have gold targets
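
The replace_unk option can be pictured as follows: each unknown token in the prediction is overwritten with the raw source word that received the most attention at that target position. The sketch below is an assumption-laden illustration of that idea rather than the builder's actual code; the function name replace_unknowns and the "<unk>" token are hypothetical, and nested lists stand in for the attention tensor.

```python
from typing import List, Sequence


def replace_unknowns(
    pred_tokens: List[str],
    src_raw: List[str],
    attn: Sequence[Sequence[float]],
    unk_token: str = "<unk>",
) -> List[str]:
    """Copy the most-attended source word over each unknown token.

    attn[i][j] is the attention weight that target position i places
    on source position j.
    """
    result = []
    for i, token in enumerate(pred_tokens):
        if token == unk_token:
            weights = attn[i]
            # Index of the source word with the highest attention weight.
            best_src = max(range(len(src_raw)), key=lambda j: weights[j])
            token = src_raw[best_src]
        result.append(token)
    return result
```

For example, with src_raw = ["ich", "besuchte", "Heidelberg"] and a prediction ["I", "visited", "<unk>"] whose last position attends mostly to the third source word, the unknown token is replaced by "Heidelberg".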