# opennmt.models.sequence_tagger module

Sequence tagger.

class opennmt.models.sequence_tagger.SequenceTagger(inputter, encoder, labels_vocabulary_file_key, tagging_scheme=None, crf_decoding=False, daisy_chain_variables=False, name='seqtagger')

A sequence tagger.

__init__(inputter, encoder, labels_vocabulary_file_key, tagging_scheme=None, crf_decoding=False, daisy_chain_variables=False, name='seqtagger')

Initializes a sequence tagger.

Parameters:
• inputter – An opennmt.inputters.inputter.Inputter to process the input data.
• encoder – An opennmt.encoders.encoder.Encoder to encode the input.
• labels_vocabulary_file_key – The data configuration key of the labels vocabulary file, which contains one label per line.
• tagging_scheme – The tagging scheme used. For supported schemes (currently only BIOES), additional evaluation metrics such as precision and recall can be computed.
• crf_decoding – If True, add a CRF layer after the encoder.
• daisy_chain_variables – If True, copy variables in a daisy chain between devices for this model. Not compatible with RNN-based models.
• name – The name of this model.
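To make the BIOES scheme mentioned for tagging_scheme concrete, the following minimal sketch groups a tag sequence into chunks. It assumes tags of the form B-X, I-X, E-X, S-X and O; the helper name extract_bioes_chunks is hypothetical and is not part of this module.

```python
def extract_bioes_chunks(tags):
    """Return (start, end, label) chunks found in a BIOES tag sequence.

    Hypothetical helper for illustration only; indices are inclusive.
    """
    chunks = []
    start = None  # index where the currently open chunk began
    label = None  # entity type of the currently open chunk
    for i, tag in enumerate(tags):
        prefix, _, kind = tag.partition("-")
        if prefix == "S":
            # Single-token chunk.
            chunks.append((i, i, kind))
            start, label = None, None
        elif prefix == "B":
            # Open a new multi-token chunk.
            start, label = i, kind
        elif prefix == "I":
            # Continue the open chunk; discard it if the sequence is malformed.
            if start is None or kind != label:
                start, label = None, None
        elif prefix == "E":
            # Close the open chunk if it is well formed.
            if start is not None and kind == label:
                chunks.append((start, i, kind))
            start, label = None, None
        else:
            # "O" or an unexpected tag closes nothing and opens nothing.
            start, label = None, None
    return chunks


tags = ["B-PER", "E-PER", "O", "S-LOC", "B-ORG", "I-ORG", "E-ORG"]
print(extract_bioes_chunks(tags))
# [(0, 1, 'PER'), (3, 3, 'LOC'), (4, 6, 'ORG')]
```

Chunk-level precision and recall compare such spans between the gold and predicted sequences rather than comparing individual tags.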
initialize(metadata)

Initializes the model from the data configuration.

Parameters: metadata – A dictionary containing additional data configuration set by the user (e.g. vocabularies, tokenization, pretrained embeddings, etc.).
compute_loss(outputs, labels, training=True, params=None)

Computes the loss.

Parameters:
• outputs – The model outputs (usually unscaled log probabilities, i.e. logits).
• labels – The dict of labels tf.Tensor.
• training – If True, compute the training loss.
• params – A dictionary of hyperparameters.
Returns: The loss or a tuple containing the computed loss and the loss to display.
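As an illustration of what a tagging loss over unscaled outputs typically looks like, here is a minimal pure-Python sketch of token-level cross entropy averaged over the non-padded positions. It is an assumption about the general technique, not this method's implementation, and it does not cover the case where crf_decoding is enabled. The function name masked_cross_entropy and its arguments are hypothetical.

```python
import math


def masked_cross_entropy(logits, label_ids, sequence_length):
    """Average cross entropy over the real (non-padded) tokens.

    Hypothetical helper for illustration only.
    logits: nested lists shaped [batch][time][num_labels].
    label_ids: nested lists shaped [batch][time].
    sequence_length: list of true lengths, one per batch entry.
    """
    total, count = 0.0, 0
    for seq_logits, seq_labels, length in zip(logits, label_ids, sequence_length):
        for t in range(length):  # positions >= length are padding and skipped
            scores = seq_logits[t]
            # Numerically stable log-sum-exp of the scores.
            top = max(scores)
            log_z = top + math.log(sum(math.exp(s - top) for s in scores))
            # Negative log-likelihood of the gold label at this position.
            total += log_z - scores[seq_labels[t]]
            count += 1
    return total / max(count, 1)


# One sequence of true length 1; the second position is padding and ignored.
logits = [[[0.0, 0.0], [5.0, 0.0]]]
label_ids = [[0, 1]]
loss = masked_cross_entropy(logits, label_ids, [1])
print(loss)
```

Masking by sequence length keeps padded positions from contributing to the average, which matters when sequences in a batch have different lengths.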
compute_metrics(predictions, labels)

Computes additional metrics on the predictions.

Parameters:
• predictions – The model predictions.
• labels – The dict of labels tf.Tensor.
Returns: A dict of metrics. See the eval_metric_ops field of tf.estimator.EstimatorSpec.
print_prediction(prediction, params=None, stream=None)

Prints the model prediction.

Parameters:
• prediction – The evaluated prediction.
• params – (optional) Dictionary of formatting parameters.
• stream – (optional) The stream to print to.
class opennmt.models.sequence_tagger.TagsInputter(vocabulary_file_key)

make_features(element=None, features=None, training=None)

Tokenizes raw text.

opennmt.models.sequence_tagger.flag_bioes_tags(gold, predicted, sequence_length=None)

Flags chunk matches for the BIOES tagging scheme.

This function will produce the gold flags and the predicted flags. For each aligned gold flag g and predicted flag p:

• when g == p == True, the chunk has been correctly identified (true positive).
• when g == False and p == True, the chunk has been incorrectly identified (false positive).
• when g == True and p == False, the chunk has been missed (false negative).
• when g == p == False, the chunk has been correctly ignored (true negative).
Parameters:
• gold – The gold tags as a NumPy 2D string array.
• predicted – The predicted tags as a NumPy 2D string array.
• sequence_length – The length of each sequence as a NumPy array.
Returns: A tuple (gold_flags, predicted_flags).
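The four cases listed above map directly onto precision, recall and F1. The sketch below shows one way to aggregate them, assuming the flags are flat, equal-length sequences of booleans; the helper prf_from_flags is hypothetical and is not provided by this module.

```python
def prf_from_flags(gold_flags, predicted_flags):
    """Compute (precision, recall, f1) from aligned boolean flags.

    Hypothetical helper for illustration only.
    """
    tp = fp = fn = 0
    for g, p in zip(gold_flags, predicted_flags):
        if g and p:
            tp += 1  # chunk correctly identified
        elif p:
            fp += 1  # chunk incorrectly identified
        elif g:
            fn += 1  # chunk missed
        # g == p == False is a true negative and does not affect these metrics.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Two true positives, one false negative, one false positive.
gold_flags = [True, True, False, True]
predicted_flags = [True, False, True, True]
precision, recall, f1 = prf_from_flags(gold_flags, predicted_flags)
print(precision, recall, f1)
```

True negatives are counted by none of the three metrics, which is why only the first three cases appear in the code.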