Releases · OpenNMT/CTranslate2

Removed: Flash Attention support in the Python package, because it significantly increased the package size for minimal performance gain.
Note: Flash Attention remains supported in the C++ package with the WITH_FLASH_ATTN option.
Flash Attention may be re-added in the future if substantial improvements are made.

New features

Support Llama 3 (#1751)
Support Gemma 2 (#1772)
Add log probs for all tokens in vocab (#1755)
Grouped conv1d (#1749 + #1758)
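
As an illustration of what "log probs for all tokens in vocab" refers to, the sketch below turns one decoding step's raw scores into a log-probability for every vocabulary entry, using a numerically stable log-softmax. It is a plain-Python sketch, not CTranslate2's implementation; the helper name `log_softmax` and the toy scores are assumptions made for the example.

```python
import math

def log_softmax(logits):
    """Return log-probabilities for every entry of `logits` (hypothetical helper)."""
    # Subtract the maximum before exponentiating so exp() cannot overflow.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]

# Toy scores for a 4-token vocabulary (made-up values).
step_logits = [2.0, 1.0, 0.1, -1.0]
step_log_probs = log_softmax(step_logits)

# One log-probability per vocabulary token; their exponentials sum to 1.
total_probability = sum(math.exp(lp) for lp in step_log_probs)
```

Returning the full vector, rather than only the score of the chosen token, lets a caller inspect how much probability mass the alternatives received at each step.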

Fixes and improvements

Fix pipeline (#1723 + #1747)
Improve Flash Attention (#1732)
Fix crash when using return_alternative on CUDA (#1733)
Support AWQ quantization (GEMM and GEMV) (#1727)

11 Jun 09:16

minhthuc2502

v4.3.1

59c7dda

CTranslate2 4.3.1

Note: The v4.3.0 release could not be published to PyPI because the project exceeded its total size limit there (> 20 GB).

Fixes and improvements

Improve the compilation (#1706 and #1705)
Fix position bias in tensor parallel mode (#1714)

17 May 08:20

minhthuc2502

v4.3.0

173a0d1

CTranslate2 4.3.0

New features

Support Phi-3 (8k and 128k) (#1700 and #1680)

Fixes and improvements

Fix Flash Attention regression (#1695)

24 Apr 10:04

minhthuc2502

v4.2.1

0527ef7

CTranslate2 4.2.1

Note: The v4.2.0 release could not be published because the package size had grown beyond the limit (> 100 MB).

New features

Support load/unload for generator/Whisper Attention (#1670)

Fixes and improvements

Fix Llama 3 (#1671)

10 Apr 11:41

minhthuc2502

v4.2.0

e491a51

CTranslate2 4.2.0

New features

Support Flash Attention (#1651)
Implementation of gemm for FLOAT32 compute type with RUY backend (#1598)
Conv1D quantization on CPU only (the DNNL and CUDA backends are not supported) (#1601)

Fixes and improvements

Fix a bug in tensor parallel mode (#1643)
Use BestSampler when temperature is 0 (#1659)
Fix a bug in Gemma (#1660)
Optimize loading/unloading time for Translator with cache (#1645)
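
The sampler item in this list can be pictured with a small sketch: dividing logits by a temperature of 0 is undefined, so a temperature of 0 is treated as greedy selection of the highest-scoring token. This is a plain-Python illustration under that assumption, not the library's code; `pick_token` and its parameters are hypothetical names.

```python
import math
import random

def pick_token(logits, temperature, rng=random):
    """Pick a token id from `logits` (hypothetical helper, not a CTranslate2 API)."""
    if temperature == 0:
        # Greedy path: temperature 0 would divide by zero, so take the argmax.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Random path: scale by the temperature, then sample from the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

toy_logits = [0.5, 3.0, 1.5]  # made-up scores
greedy_id = pick_token(toy_logits, temperature=0)
sampled_id = pick_token(toy_logits, temperature=1.0, rng=random.Random(0))
```

With a temperature of 0 the result is deterministic, which matches the intent of selecting the best token instead of sampling.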

12 Mar 08:59

minhthuc2502

v4.1.1

bfa0cb3

CTranslate2 4.1.1

Fixes and improvements

Fix classifiers in setup.py so that the package can be pushed to PyPI

