TransformerSpec

class ctranslate2.specs.TransformerSpec

Describes a Transformer model.

The specification is invariant to hidden dimensions but requires the number of layers and attention heads to be set explicitly.

Inherits from: ctranslate2.specs.SequenceToSequenceModelSpec

Attributes:

Methods:

__init__(encoder: TransformerEncoderSpec, decoder: TransformerDecoderSpec)

Initializes a Transformer model specification.

Parameters
  • encoder – The encoder specification.

  • decoder – The decoder specification.

classmethod from_config(num_layers: Union[int, Tuple[int, int]], num_heads: int, with_relative_position: bool = False, pre_norm: bool = True, no_final_norm: bool = False, activation: Activation = Activation.RELU, alignment_layer: int = -1, alignment_heads: int = 1, num_source_embeddings: int = 1, embeddings_merge: EmbeddingsMerge = EmbeddingsMerge.CONCAT, layernorm_embedding: bool = False, relative_attention_bias: bool = False, ffn_glu: bool = False, rms_norm: bool = False, multi_query_attention: bool = False)

Creates a Transformer model specification.

Parameters
  • num_layers – Number of encoder and decoder layers, or a 2-tuple if the number is different.

  • num_heads – Number of attention heads.

  • with_relative_position – Use relative position representations in the self-attention layers as described in https://arxiv.org/abs/1803.02155.

  • pre_norm – Enable the pre-norm Transformer architecture.

  • no_final_norm – Disable the final layer norm in the pre-norm architecture.

  • activation – Activation to apply in the feed-forward network.

  • alignment_layer – Layer index selected for alignment.

  • alignment_heads – Number of attention heads selected for alignment.

  • num_source_embeddings – Number of source embeddings.

  • embeddings_merge – When num_source_embeddings > 1, specify how the embeddings are merged.

  • layernorm_embedding – Apply layer normalization after the embedding layer.

  • relative_attention_bias – Use relative attention bias in the self-attention layers as described in the T5 paper https://arxiv.org/abs/1910.10683.

  • ffn_glu – Use gated linear units in the FFN layer as described in https://arxiv.org/abs/2002.05202.

  • rms_norm – Use the root mean square layer normalization.

  • multi_query_attention – Use multi-query attention.
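
As an illustration of the num_layers convention described above, the following sketch shows how an integer or a 2-tuple could be resolved into encoder and decoder depths, and how from_config might then be called. The names split_num_layers and build_spec are hypothetical and not part of CTranslate2, the layer and head counts are arbitrary, and build_spec assumes the ctranslate2 package is installed.

```python
from typing import Tuple, Union


def split_num_layers(num_layers: Union[int, Tuple[int, int]]) -> Tuple[int, int]:
    """Hypothetical helper: resolve num_layers into (encoder, decoder) depths."""
    if isinstance(num_layers, (list, tuple)):
        num_encoder_layers, num_decoder_layers = num_layers
        return num_encoder_layers, num_decoder_layers
    # A single integer is used for both the encoder and the decoder.
    return num_layers, num_layers


def build_spec():
    """Hypothetical wrapper: request 6 encoder layers and 3 decoder layers."""
    # Imported lazily so that the helper above works without the package.
    import ctranslate2

    return ctranslate2.specs.TransformerSpec.from_config(
        num_layers=(6, 3),  # 2-tuple: (encoder layers, decoder layers)
        num_heads=8,
        pre_norm=True,
    )
```

The weights of a specification created this way are still unset; a converter is expected to fill them in before the model is validated and saved.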

get_default_config()

Returns the default configuration used by this model.

get_source_vocabulary_size()

Returns the source vocabulary size expected by the model.

get_target_vocabulary_size()

Returns the target vocabulary size expected by the model.

optimize(quantization: Optional[str] = None) → None

Recursively applies some optimizations to this layer:

  • Alias variables with the same shape and value.

  • Quantize weights.

Parameters

quantization – Weight quantization scheme (possible values are: int8, int8_float32, int8_float16, int8_bfloat16, int16, float16, bfloat16, float32).
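
The first optimization listed above, aliasing variables with the same shape and value, can be pictured with the following minimal sketch. It is not the library's implementation: alias_duplicates is a hypothetical helper, weights are modeled as nested Python lists rather than arrays, and an alias is represented by storing the name of the first variable that holds the same data.

```python
def alias_duplicates(variables):
    """Hypothetical helper: replace repeated values with the name of the first copy."""
    result = {}
    unique = []  # (name, value) pairs for values seen so far
    for name, value in variables.items():
        for first_name, first_value in unique:
            # Equal nested lists have both the same shape and the same values.
            if value == first_value:
                result[name] = first_name  # store an alias instead of the data
                break
        else:
            unique.append((name, value))
            result[name] = value
    return result


variables = {
    "encoder/embeddings": [[1.0, 2.0], [3.0, 4.0]],
    "decoder/embeddings": [[1.0, 2.0], [3.0, 4.0]],  # same shape and value
    "decoder/projection": [[1.0, 2.0, 3.0, 4.0]],    # same numbers, different shape
}
aliased = alias_duplicates(variables)
```

In this sketch only decoder/embeddings becomes an alias, since decoder/projection differs in shape even though it holds the same numbers.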

register_file(path: str, filename: Optional[str] = None) → None

Registers a file to be saved in the model directory.

register_source_vocabulary(tokens: List[str]) → None

Registers a source vocabulary of tokens.

Parameters

tokens – List of source tokens.

register_target_vocabulary(tokens: List[str]) → None

Registers a target vocabulary of tokens.

Parameters

tokens – List of target tokens.

register_vocabulary_mapping(path: str) → None

Registers a vocabulary mapping file.

Parameters

path – Path to the vocabulary mapping file.

save(output_dir: str) → None

Saves this model on disk.

Parameters

output_dir – Output directory where the model is saved.

validate() → None

Verifies that the required weights are set.

Raises

ValueError – If a required weight is not set in the specification.
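
The check described above can be sketched as a recursive walk that raises ValueError on the first unset weight. This is only an illustration under stated assumptions: the specification is modeled as a nested dictionary, an unset weight is represented by None, names are joined with "/", and check_required_weights is a hypothetical helper rather than the library's code.

```python
def check_required_weights(spec, scope=""):
    """Hypothetical helper: raise ValueError if any weight in the tree is None."""
    for name, value in spec.items():
        full_name = f"{scope}/{name}" if scope else name
        if isinstance(value, dict):
            # Descend into child layers.
            check_required_weights(value, full_name)
        elif value is None:
            raise ValueError(f"Missing value for weight {full_name}")


complete = {"encoder": {"weight": [1.0]}, "decoder": {"weight": [2.0]}}
incomplete = {"encoder": {"weight": [1.0]}, "decoder": {"weight": None}}

check_required_weights(complete)  # returns silently

try:
    check_required_weights(incomplete)
    error_message = None
except ValueError as error:
    error_message = str(error)
```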

variables(prefix: str = '', ordered: bool = False) → Dict[str, ndarray]

Recursively returns the weights from this layer and its children.

Parameters
  • prefix – Prefix to prepend to all variable names.

  • ordered – If set, an ordered list is returned instead of a dictionary.

Returns

Dictionary mapping variable names to values.
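
To make the roles of prefix and ordered concrete, here is a minimal sketch of a recursive collection over a nested structure. It rests on assumptions that the text above does not confirm: layers are modeled as nested dictionaries, names are joined with "/", and the ordered result is a list of (name, value) pairs sorted by name. collect_variables is a hypothetical helper, not the library's method.

```python
def collect_variables(spec, prefix="", ordered=False):
    """Hypothetical helper: flatten nested weights into prefixed names."""
    variables = {}

    def visit(node, scope):
        for name, value in node.items():
            full_name = f"{scope}/{name}" if scope else name
            if isinstance(value, dict):
                visit(value, full_name)  # recurse into child layers
            else:
                variables[full_name] = value

    visit(spec, prefix)
    if ordered:
        # Assumption: "ordered" means sorted by variable name.
        return sorted(variables.items(), key=lambda item: item[0])
    return variables


spec = {
    "encoder": {"layer_0": {"weight": [1.0]}},
    "decoder": {"weight": [2.0]},
}
flat = collect_variables(spec)
prefixed = collect_variables(spec, prefix="model")
as_list = collect_variables(spec, ordered=True)
```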

property config

The model configuration.

property name

The name of the model specification.

property revision

The model specification revision.

This value is incremented each time the weights layout of the model is changed (e.g. a weight is renamed).