Releases: OpenNMT/CTranslate2

CTranslate2 4.6.2

05 Dec 08:32
1251f7c

New features

Fixes and improvements

CTranslate2 4.6.1

07 Nov 16:26
785c7e0

New features

  • Python 3.14 support (#1926)
  • Support for CUDA 12.4 (#1925)
  • Update Intel oneAPI to version 2025.3 (#1931)

CTranslate2 4.6.0

08 Apr 15:33
617405f

Note: The CTranslate2 Python package now supports Python 3.13 and drops support for Python 3.8.

New features

  • Python 3.13 support (#1858)
  • Support returning hidden vectors in Wav2Vec2 and Wav2Vec2Bert models (#1867)
  • Add noexecstack linker flags (#1852 + #1861)
  • Support Qwen2 (#1820)
  • Eoleconv (#1832)
  • Add support for RobertModel (#1864)

Fixes and improvements

  • Fix GitHub Action (#1871)
  • Prevent double library def (#1818)

CTranslate2 4.5.0

22 Oct 11:23
383d063

Note: The CTranslate2 Python package now supports cuDNN 9 and is no longer compatible with cuDNN 8.

New features

  • Support Phi3 (#1800)
  • Support Mistral Nemo (#1785)
  • Support Wav2Vec2Bert ASR (#1778)

Fixes and improvements

CTranslate2 4.4.0

09 Sep 09:21
8f4d134

Removed: Flash Attention support in the Python package due to significant package size increase with minimal performance gain.
Note: Flash Attention remains supported in the C++ package with the WITH_FLASH_ATTN option.
Flash Attention may be re-added in the future if substantial improvements are made.

New features

Fixes and improvements

  • Fix pipeline (#1723 + #1747)
  • Some improvements in flash attention (#1732)
  • Fix crash when using return_alternative on CUDA (#1733)
  • Quantization AWQ GEMM + GEMV (#1727)

CTranslate2 4.3.1

11 Jun 09:16
59c7dda

Note: Because the project exceeded its size limit on PyPI (> 20 GB), release v4.3.0 could not be published successfully.

Fixes and improvements

  • Improve the compilation (#1706 and #1705)
  • Fix position bias in tensor parallel mode (#1714)

CTranslate2 4.3.0

17 May 08:20
173a0d1

New features

Fixes and improvements

  • Fix Flash Attention regression (#1695)

CTranslate2 4.2.1

24 Apr 10:04
0527ef7

Note: Because the package size increased (> 100 MB), release v4.2.0 could not be published successfully.

New features

  • Support load/unload for generator/Whisper Attention (#1670)

Fixes and improvements

CTranslate2 4.2.0

10 Apr 11:41
e491a51

New features

  • Support Flash Attention (#1651)
  • Implementation of gemm for FLOAT32 compute type with RUY backend (#1598)
  • Conv1D quantization for CPU only (the DNNL and CUDA backends are not supported) (#1601)

Fixes and improvements

  • Fix tensor parallel bug (#1643)
  • Use BestSampler when temperature is 0 (#1659)
  • Fix Gemma bug (#1660)
  • Optimize loading/unloading time for Translator with cache (#1645)
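
The "Use BestSampler when temperature is 0" entry above can be illustrated with a small sketch. This is not CTranslate2's implementation: the function name sample_token and its parameters are hypothetical, and the code only shows the general idea that a temperature of 0 cannot be used to scale logits (it would divide by zero), so a sampler can fall back to greedy selection of the highest-scoring token instead.

```python
import math
import random


def sample_token(logits, temperature=1.0, rng=None):
    """Pick a token index from raw logits (hypothetical helper, for illustration only)."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")

    if temperature == 0:
        # Scaling by 1/temperature is undefined here, so fall back to
        # greedy ("best") selection: return the index of the largest logit.
        return max(range(len(logits)), key=lambda i: logits[i])

    rng = rng or random.Random()
    # Temperature-scaled softmax weights; subtract the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    # Draw one index with probability proportional to its weight.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# With temperature 0 the result is deterministic: always the argmax.
greedy_choice = sample_token([0.1, 2.5, -1.0], temperature=0)
```

With a positive temperature the same call draws randomly, so repeated calls may return different indices unless a seeded random.Random instance is passed as rng.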

CTranslate2 4.1.1

12 Mar 08:59
bfa0cb3

Fixes and improvements

  • Fix classifiers in setup.py to push the PyPI package