Installation

Install with pip

Binary Python wheels are published on PyPI and can be installed directly with pip:

pip install ctranslate2
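
A quick way to verify the installation is to import the module and print its version:

python -c "import ctranslate2; print(ctranslate2.__version__)"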

The Python wheels have the following requirements:

  • OS: Linux (x86-64, AArch64), macOS (x86-64, ARM64), Windows (x86-64)

  • Python version: >= 3.7

  • pip version: >= 19.3 to support manylinux2014 wheels
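
If your pip is too old to select manylinux2014 wheels, upgrade it first:

pip install --upgrade pip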

GPU support

The Linux and Windows Python wheels support GPU execution. Install CUDA 12.x to use the GPU.

If you plan to run models with convolutional layers (e.g. for speech recognition), you should also install cuDNN 8 for CUDA 12.x.
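
After installing CUDA (and cuDNN if needed), you can check that the GPU is visible by querying the number of CUDA devices from Python:

python -c "import ctranslate2; print(ctranslate2.get_cuda_device_count())"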

Note

On Windows, the Visual C++ runtime is required. It is installed on most systems, but if it is not, you have to download and install it.

Install with Docker

Docker images can be downloaded from the GitHub Container registry:

docker pull ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2

The images include:

  • the NVIDIA libraries cuBLAS and cuDNN to support GPU execution

  • the C++ library installed in /opt/ctranslate2

  • the Python module installed in the Python system packages

  • the translator executable, which is the image entrypoint:

docker run --rm ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 --help
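
As a usage sketch, assuming a converted model directory named ende_ctranslate2 in the current directory (a hypothetical path) and tokenized input on standard input:

echo "▁H ello ▁world !" | docker run -i --rm -v $PWD:/data \
    ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 \
    --model /data/ende_ctranslate2 --device cpu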

To use CUDA 12, update to a newer image tag that is built against CUDA 12; the available tags are listed on the GitHub Container registry.

GPU support

The Docker image supports GPU execution. Install the NVIDIA Container Toolkit to use GPUs from Docker.
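
With the toolkit installed, pass Docker's --gpus flag to expose GPUs to the container, for example:

docker run --rm --gpus all ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 --help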

Install from sources

Download the source code

Clone the CTranslate2 Git repository and its submodules.

git clone --recursive https://github.com/OpenNMT/CTranslate2.git

Compile the C++ library

Compiling the library requires a compiler supporting C++17 and CMake 3.15 or greater.

mkdir build && cd build
cmake ..
make -j4
sudo make install
sudo ldconfig

By default, the library is compiled with the Intel MKL backend, which must be installed separately. See the Build options below to select or add another backend.
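
For example, to build against OpenBLAS instead of Intel MKL (a sketch, assuming OpenBLAS is installed system-wide):

cmake -DWITH_MKL=OFF -DWITH_OPENBLAS=ON ..
make -j4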

Compile the Python wrapper

Once the C++ library is installed, you can compile the Python wrapper which uses pybind11. This step requires the Python development libraries to be installed on the system.

cd python
pip install -r install_requirements.txt
python setup.py bdist_wheel
pip install dist/*.whl

Attention

If you installed the C++ library in a custom directory, you should configure additional environment variables, as shown in the example after this list:

  • When running setup.py, set CTRANSLATE2_ROOT to the CTranslate2 install directory.

  • When running your Python application, add the CTranslate2 library path to LD_LIBRARY_PATH.
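
For example, assuming the library was installed under /opt/ctranslate2-custom (a hypothetical prefix):

# point the build at the custom C++ install prefix
export CTRANSLATE2_ROOT=/opt/ctranslate2-custom
python setup.py bdist_wheel
pip install dist/*.whl
# make the shared library visible at runtime
export LD_LIBRARY_PATH=$CTRANSLATE2_ROOT/lib:$LD_LIBRARY_PATH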

Build options

The following options can be set with -DOPTION=VALUE during the CMake configuration (default values are marked with *):

  • BUILD_CLI (OFF, ON*): Compiles the command line clients

  • BUILD_TESTS (OFF*, ON): Compiles the tests

  • CMAKE_CXX_FLAGS (compiler flags): Defines additional compiler flags

  • CMAKE_INSTALL_PREFIX (path): Defines the installation path of the library

  • CUDA_ARCH_LIST (Auto*): List of CUDA architectures to compile for (see cuda_select_nvcc_arch_flags in the CMake documentation)

  • CUDA_DYNAMIC_LOADING (OFF*, ON): Enables the dynamic loading of CUDA libraries at runtime instead of linking against them (requires CUDA >= 11)

  • CUDA_NVCC_FLAGS (compiler flags): Defines additional compilation flags for nvcc

  • ENABLE_CPU_DISPATCH (OFF, ON*): Compiles CPU kernels for multiple ISAs and dispatches at runtime (should be disabled when explicitly targeting an architecture with the -march compilation flag)

  • ENABLE_PROFILING (OFF*, ON): Enables the integrated profiler (usually disabled in production builds)

  • OPENMP_RUNTIME (INTEL*, COMP, NONE): Selects the OpenMP runtime:

      • INTEL: Intel OpenMP
      • COMP: OpenMP runtime provided by the compiler
      • NONE: no OpenMP runtime (a custom threading implementation will be used)

  • WITH_CUDA (OFF*, ON): Compiles with the CUDA backend

  • WITH_CUDNN (OFF*, ON): Compiles with the cuDNN backend

  • WITH_DNNL (OFF*, ON): Compiles with the oneDNN backend (a.k.a. DNNL)

  • WITH_MKL (OFF, ON*): Compiles with the Intel MKL backend

  • WITH_ACCELERATE (OFF*, ON): Compiles with the Apple Accelerate backend

  • WITH_OPENBLAS (OFF*, ON): Compiles with the OpenBLAS backend

  • WITH_RUY (OFF*, ON): Compiles with the Ruy backend

Some build options require additional dependencies. See their respective documentation for installation instructions.

  • -DWITH_CUDA=ON requires CUDA >= 11.0

  • -DWITH_CUDNN=ON requires cuDNN >= 8

  • -DWITH_MKL=ON requires Intel MKL >= 2019.5

  • -DWITH_DNNL=ON requires oneDNN >= 3.0

  • -DWITH_ACCELERATE=ON requires Accelerate

  • -DWITH_OPENBLAS=ON requires OpenBLAS

Multiple backends can be enabled for a single build, for example:

  • -DWITH_MKL=ON -DWITH_CUDA=ON: enable CPU and GPU support

  • -DWITH_MKL=ON -DWITH_DNNL=ON: at runtime, the library will select Intel MKL when running on Intel CPUs and oneDNN when running on AMD CPUs

  • -DWITH_OPENBLAS=ON -DWITH_RUY=ON: use Ruy for quantized models and OpenBLAS for non-quantized models
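
Putting it together, a combined CPU and GPU build could be configured like this (a sketch, assuming Intel MKL and the CUDA toolkit are installed):

cmake -DWITH_MKL=ON -DWITH_CUDA=ON ..
make -j4
sudo make install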