Installation

Install with pip

Binary Python wheels are published on PyPI and can be directly installed with pip:

pip install ctranslate2

The Python wheels have the following requirements:

  • OS: Linux (x86-64, AArch64), macOS (x86-64, ARM64), Windows (x86-64)

  • Python version: >= 3.6

  • pip version: >= 19.3 to support manylinux2014 wheels
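If the installed pip is older than 19.3, it cannot select the manylinux2014 wheels and may try to build from source instead. A minimal sketch of checking and upgrading pip first:

```shell
# Check the installed pip version, then upgrade it so that
# manylinux2014 wheels can be selected.
pip --version
pip install --upgrade pip
```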

GPU support

The Linux and Windows Python wheels support GPU execution. Install CUDA 11.2 or above to use the GPU.
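One way to check that the GPU is visible after installation is to query the CUDA device count from the Python API (assuming the wheel and CUDA are installed):

```shell
# Prints the number of CUDA devices CTranslate2 can see;
# a non-zero value means GPU execution is available.
python -c "import ctranslate2; print(ctranslate2.get_cuda_device_count())"
```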

Install with Docker

Docker images can be downloaded from the repository opennmt/ctranslate2:

docker pull opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2

The images include:

  • the C++ library installed in /opt/ctranslate2

  • the Python module installed in the Python system packages

  • the translation executable, which is the image entrypoint:

docker run --rm opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 --help

GPU support

The Docker image supports GPU execution. Install the NVIDIA Container Toolkit to use GPUs from Docker.
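With the NVIDIA Container Toolkit installed, host GPUs are exposed to the container with the `--gpus` flag of `docker run`, for example:

```shell
# Run the translator inside the container with all host GPUs visible.
docker run --rm --gpus all opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 --help
```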

Install from sources

Download the source code

Clone the CTranslate2 Git repository and its submodules:

git clone --recursive https://github.com/OpenNMT/CTranslate2.git

Compile the C++ library

Compiling the library requires a compiler supporting C++17 and CMake 3.15 or greater.

mkdir build && cd build
cmake ..
make -j4
make install

By default, the library is compiled with the Intel MKL backend, which must be installed separately. See the Build options to select or add another backend.
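As a sketch, a build that disables the default Intel MKL backend and uses OpenBLAS instead (the option names are described under Build options below) could look like:

```shell
mkdir build && cd build
# Replace the default Intel MKL backend with OpenBLAS.
cmake -DWITH_MKL=OFF -DWITH_OPENBLAS=ON ..
make -j4
make install
```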

Compile the Python wrapper

Once the C++ library is installed, you can compile the Python wrapper which uses pybind11. This step requires the Python development libraries to be installed on the system.

cd python
pip install -r install_requirements.txt
python setup.py bdist_wheel
pip install dist/*.whl

Attention

If you installed the C++ library in a custom directory, you should configure additional environment variables:

  • When running setup.py, set CTRANSLATE2_ROOT to the CTranslate2 install directory.

  • When running your Python application, add the CTranslate2 library path to LD_LIBRARY_PATH.
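For example, assuming the library was installed under $HOME/ctranslate2 (a hypothetical path), the two variables would be set like this:

```shell
# Point setup.py at the custom CTranslate2 install directory.
export CTRANSLATE2_ROOT=$HOME/ctranslate2
python setup.py bdist_wheel

# Make the shared library visible when running Python applications.
export LD_LIBRARY_PATH=$CTRANSLATE2_ROOT/lib:$LD_LIBRARY_PATH
```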

Build options

The following options can be set with -DOPTION=VALUE during the CMake configuration:

  • BUILD_CLI (OFF, ON): Compiles the command line clients

  • BUILD_TESTS (OFF, ON): Compiles the tests

  • CMAKE_CXX_FLAGS (compiler flags): Defines additional compiler flags

  • CMAKE_INSTALL_PREFIX (path): Defines the installation path of the library

  • CUDA_ARCH_LIST (Auto): List of CUDA architectures to compile for (see cuda_select_nvcc_arch_flags in the CMake documentation)

  • CUDA_DYNAMIC_LOADING (OFF, ON): Enables the dynamic loading of CUDA libraries at runtime instead of linking against them (requires CUDA >= 11)

  • CUDA_NVCC_FLAGS (compiler flags): Defines additional compilation flags for nvcc

  • ENABLE_CPU_DISPATCH (OFF, ON): Compiles CPU kernels for multiple ISAs and dispatches at runtime (should be disabled when explicitly targeting an architecture with the -march compilation flag)

  • ENABLE_PROFILING (OFF, ON): Enables the integrated profiler (usually disabled in production builds)

  • OPENMP_RUNTIME (INTEL, COMP, NONE): Selects or disables the OpenMP runtime:
      • INTEL: Intel OpenMP
      • COMP: OpenMP runtime provided by the compiler
      • NONE: no OpenMP runtime

  • WITH_CUDA (OFF, ON): Compiles with the CUDA backend

  • WITH_DNNL (OFF, ON): Compiles with the oneDNN backend (a.k.a. DNNL)

  • WITH_MKL (OFF, ON): Compiles with the Intel MKL backend

  • WITH_ACCELERATE (OFF, ON): Compiles with the Apple Accelerate backend

  • WITH_OPENBLAS (OFF, ON): Compiles with the OpenBLAS backend

  • WITH_RUY (OFF, ON): Compiles with the Ruy backend

Some build options require additional dependencies. See their respective documentation for installation instructions.

  • -DWITH_CUDA=ON requires CUDA >= 10.0

  • -DWITH_MKL=ON requires Intel MKL >= 2019.5

  • -DWITH_DNNL=ON requires oneDNN >= 1.5

  • -DWITH_ACCELERATE=ON requires Accelerate

  • -DWITH_OPENBLAS=ON requires OpenBLAS

Multiple backends can be enabled for a single build, for example:

  • -DWITH_MKL=ON -DWITH_CUDA=ON: enable CPU and GPU support

  • -DWITH_MKL=ON -DWITH_DNNL=ON: at runtime, the library will select Intel MKL when running on Intel processors and oneDNN when running on AMD processors

  • -DWITH_OPENBLAS=ON -DWITH_RUY=ON: use Ruy for quantized models and OpenBLAS for non-quantized models
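Putting it together, a CPU and GPU build combining several of the options above might be configured as follows (a sketch, not the only valid combination):

```shell
mkdir build && cd build
# Enable the Intel MKL (CPU) and CUDA (GPU) backends, and load the
# CUDA libraries dynamically at runtime instead of linking against them.
cmake -DWITH_MKL=ON -DWITH_CUDA=ON -DCUDA_DYNAMIC_LOADING=ON ..
make -j4
make install
```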