TransformerSpec
- class ctranslate2.specs.TransformerSpec
Describes a Transformer model.
The specification is invariant to hidden dimensions but requires the number of layers and attention heads to be set explicitly.
Inherits from:
ctranslate2.specs.SequenceToSequenceModelSpec
Methods:
- __init__(encoder: TransformerEncoderSpec, decoder: TransformerDecoderSpec)
Initializes a Transformer model specification.
- Parameters
encoder – The encoder specification.
decoder – The decoder specification.
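A minimal construction sketch. The encoder and decoder specification constructors are not documented on this page, so the leading (num_layers, num_heads) arguments below are an assumption; from_config below is the usual entry point.

```python
from ctranslate2.specs import (
    TransformerDecoderSpec,
    TransformerEncoderSpec,
    TransformerSpec,
)

# Assumed signatures: both sub-specifications are taken to accept the
# number of layers and attention heads as their first two arguments.
encoder = TransformerEncoderSpec(6, 8)
decoder = TransformerDecoderSpec(6, 8)
spec = TransformerSpec(encoder, decoder)
```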
- classmethod from_config(num_layers: Union[int, Tuple[int, int]], num_heads: int, with_relative_position: bool = False, pre_norm: bool = True, no_final_norm: bool = False, activation: Activation = Activation.RELU, alignment_layer: int = -1, alignment_heads: int = 1, num_source_embeddings: int = 1, embeddings_merge: EmbeddingsMerge = EmbeddingsMerge.CONCAT, layernorm_embedding: bool = False, relative_attention_bias: bool = False, ffn_glu: bool = False, rms_norm: bool = False, multi_query_attention: bool = False)
Creates a Transformer model specification.
- Parameters
num_layers – Number of encoder and decoder layers, or a 2-tuple if the two counts differ.
num_heads – Number of attention heads.
with_relative_position – Use relative position representations in the self-attention layers as described in https://arxiv.org/abs/1803.02155.
pre_norm – Enable the pre-norm Transformer architecture.
no_final_norm – Disable the final layer norm in the pre-norm architecture.
activation – Activation to apply in the feed-forward network.
alignment_layer – Layer index selected for alignment.
alignment_heads – Number of attention heads selected for alignment.
num_source_embeddings – Number of source embeddings.
embeddings_merge – When num_source_embeddings > 1, specify how the embeddings are merged.
layernorm_embedding – Apply layer normalization after the embedding layer.
relative_attention_bias – Use relative attention bias in the self-attention layers as described in the T5 paper https://arxiv.org/abs/1910.10683.
ffn_glu – Use gated linear units in the FFN layer as described in https://arxiv.org/abs/2002.05202.
rms_norm – Use the root mean square layer normalization.
multi_query_attention – Use multi-query attention.
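A short sketch of building a specification with from_config; the layer and head counts here are illustrative, not defaults:

```python
from ctranslate2.specs import TransformerSpec

# 6 encoder layers, 6 decoder layers, 8 attention heads.
spec = TransformerSpec.from_config(num_layers=6, num_heads=8)

# Pass a 2-tuple when the encoder and decoder depths differ.
asymmetric = TransformerSpec.from_config(num_layers=(12, 6), num_heads=8)
```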
- get_default_config()
Returns the default configuration used by this model.
- get_source_vocabulary_size()
Returns the source vocabulary size expected by the model.
- get_target_vocabulary_size()
Returns the target vocabulary size expected by the model.
- optimize(quantization: Optional[str] = None) → None
Recursively applies the following optimizations to this layer:
- Alias variables with the same shape and value.
- Quantize weights.
- Parameters
quantization – Weight quantization scheme (possible values are: int8, int8_float32, int8_float16, int8_bfloat16, int16, float16, bfloat16, float32).
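For example, assuming spec already holds the converted weights, a minimal quantization call could look like:

```python
# Quantize the weights to 8-bit integers; passing None (the default)
# applies only the variable aliasing optimization.
spec.optimize(quantization="int8")
```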
- register_file(path: str, filename: Optional[str] = None) → None
Registers a file to be saved in the model directory.
- register_source_vocabulary(tokens: List[str]) → None
Registers a source vocabulary of tokens.
- Parameters
tokens – List of source tokens.
- register_target_vocabulary(tokens: List[str]) → None
Registers a target vocabulary of tokens.
- Parameters
tokens – List of target tokens.
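A sketch with toy vocabularies; a real converter extracts these from the trained model:

```python
# The token lists are illustrative; their order must match the
# embedding rows of the converted model.
spec.register_source_vocabulary(["<blank>", "<s>", "</s>", "hello", "world"])
spec.register_target_vocabulary(["<blank>", "<s>", "</s>", "hallo", "welt"])
```

Once the embedding weights are set, get_source_vocabulary_size() and get_target_vocabulary_size() should report sizes consistent with these lists.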
- register_vocabulary_mapping(path: str) → None
Registers a vocabulary mapping file.
- Parameters
path – Path to the vocabulary mapping file.
- save(output_dir: str) → None
Saves this model on disk.
- Parameters
output_dir – Output directory where the model is saved.
- validate() → None
Verifies that the required weights are set.
- Raises
ValueError – If a required weight is not set in the specification.
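Putting the final steps together, assuming all weights were assigned by a converter:

```python
spec.validate()               # raises ValueError if a weight is missing
spec.save("converted_model")  # writes the model files to this directory
```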
- variables(prefix: str = '', ordered: bool = False) → Dict[str, ndarray]
Recursively returns the weights from this layer and its children.
- Parameters
prefix – Prefix to prepend to all variable names.
ordered – If set, an ordered list is returned instead.
- Returns
Dictionary mapping variable names to values.
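A sketch of inspecting the weights of a populated specification:

```python
# List every variable with its shape; the prefix is prepended
# to each variable name.
for name, value in spec.variables(prefix="model/").items():
    print(name, value.shape)
```

With ordered=True the method returns an ordered list instead of a dictionary, which is useful when a stable serialization order is needed.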
- property config
The model configuration.
- property name
The name of the model specification.
- property revision
The model specification revision.
This value is incremented each time the weight layout of the model changes (e.g. a weight is renamed).
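For instance:

```python
print(spec.name)      # the specification name, e.g. "TransformerSpec"
print(spec.revision)  # bumped whenever the weight layout changes
```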