In-graph SentencePiece tokenizer using tensorflow_text.SentencepieceTokenizer.
Inherits from: opennmt.tokenizers.tokenizer.TensorFlowTokenizer
__init__(model, nbest_size=0, alpha=1.0)
Initializes the tokenizer.
- Parameters
model – Path to the SentencePiece model.
nbest_size – Number of candidate segmentations to sample from. The default of 0 performs no sampling, and sampling is always disabled during inference.
alpha – Smoothing parameter for the sampling.
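
The interaction between nbest_size and alpha can be illustrated with a small pure-Python sketch. It is not the library's implementation: the candidate segmentations, their log-probabilities, and the helper names sampling_probs and sample_segmentation are hypothetical, and the sketch only assumes the usual subword-regularization rule in which a candidate is drawn with probability proportional to its probability raised to the power alpha.

```python
import math
import random

# Hypothetical candidate segmentations of one word with made-up log-probabilities.
CANDIDATES = [
    (["▁hello"], -2.0),
    (["▁hel", "lo"], -3.0),
    (["▁he", "llo"], -3.5),
]


def sampling_probs(log_probs, alpha):
    """Return probabilities proportional to exp(alpha * log_prob), i.e. P ** alpha."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_probs)
    weights = [math.exp(alpha * (lp - peak)) for lp in log_probs]
    total = sum(weights)
    return [w / total for w in weights]


def sample_segmentation(candidates, nbest_size=0, alpha=1.0, rng=random):
    """Pick one segmentation from (pieces, log_prob) pairs.

    Illustrative only: with nbest_size of 0 or 1 the best candidate is returned,
    otherwise one of the top nbest_size candidates is sampled.
    """
    if nbest_size < 0:
        raise ValueError("this sketch only covers nbest_size >= 0")
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if nbest_size in (0, 1):
        return ranked[0][0]
    pool = ranked[:nbest_size]
    probs = sampling_probs([lp for _, lp in pool], alpha)
    return rng.choices([pieces for pieces, _ in pool], weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_segmentation(CANDIDATES))  # no sampling: best candidate
    print(sample_segmentation(CANDIDATES, nbest_size=3, alpha=0.1, rng=rng))
```

In this sketch a smaller alpha flattens the distribution over the candidates, while a larger alpha concentrates it on the most likely segmentation. With a real model, the corresponding values would be passed to the constructor as nbest_size and alpha.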