Releases: ggerganov/llama.cpp

b2928

18 May 16:58
059031b
ci : re-enable sanitizer runs (#7358)

* Revert "ci : temporary disable sanitizer builds (#6128)"

This reverts commit 4f6d1337ca5a409dc74aca8c479b7c34408a69c0.

* ci : trigger

b2927

18 May 15:14
511182e
android : use "ci-android" branch for CI (#7341)

* android : use "ci-android" branch for CI

* ggml : disable SIMD exp and silu for 32-bit ARM

ggml-ci

* android : do not fetch, use add_subdirectory instead

* cmake : provide binary dir

b2926

18 May 15:08
133d99c
CUDA: deduplicate FlashAttention code (#7352)

b2923

18 May 14:59
0f98acf
llama : add support for larger Granite Code Models (20B, 34B) (#7324)

Tie the weights for ARCH_STARCODER to support the larger Granite code models.
Partially addresses ggerganov/issues/7116

A few things still remain to be fixed.
Currently requires `--override-kv tokenizer.ggml.add_bos_token=bool:false`
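
As a rough illustration of the override above, the sketch below builds the `KEY=TYPE:VALUE` string that `--override-kv` expects. The helper name `format_override_kv` is hypothetical and not part of llama.cpp, and the set of supported types (`int`, `float`, `bool`) is an assumption based on the flag's usual help text.

```python
def format_override_kv(key, value):
    """Format one metadata override as KEY=TYPE:VALUE (hypothetical helper)."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return f"{key}=bool:{'true' if value else 'false'}"
    if isinstance(value, int):
        return f"{key}=int:{value}"
    if isinstance(value, float):
        return f"{key}=float:{value}"
    raise TypeError(f"unsupported override type: {type(value).__name__}")


# Argument pair to append to a llama.cpp command line for the Granite models.
override_args = [
    "--override-kv",
    format_override_kv("tokenizer.ggml.add_bos_token", False),
]
print(" ".join(override_args))
```

The resulting pair can be appended to whatever llama.cpp command loads the model; the rest of that command line is left out here because it depends on the local setup.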

b2922

18 May 13:54
ca57e0f
perplexity : ndot progress and show stats with < 100 tasks (#7348)

Fix a floating-point error in ndot progress printing, and allow end-of-run stats with fewer tasks when running multiple-choice tasks.

b2921

18 May 08:42
c1b295e
Update and fix Vulkan soft_max and argsort implementations (#7237)

* Update and fix Vulkan softmax implementation

* Update and fix Vulkan argsort implementation

b2918

18 May 02:15
0583484
ggml : fix quants nans when all the group weights are very close to zero (#7313)
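
The release title does not say how the NaNs arise, so the following is only a generic sketch of one way this kind of failure can happen, not the actual change in #7313. It assumes a weighted least-squares scale of the form sum(w*x*q) / sum(w*q*q). When every weight in the group is near zero, both sums collapse toward zero and the division becomes 0/0, which is NaN in C floating-point arithmetic. The function name, the `eps` threshold, and the zero fallback are all hypothetical.

```python
def weighted_scale(x, q, w, eps=1e-30):
    """Scale d minimizing sum(w_i * (x_i - d * q_i)**2), with a guard.

    x: original values, q: quantized integer levels, w: importance weights.
    """
    num = sum(wi * xi * qi for wi, xi, qi in zip(w, x, q))
    den = sum(wi * qi * qi for wi, qi in zip(w, q))
    # Hypothetical guard: if the weighted denominator is effectively zero,
    # return a fallback instead of dividing (0/0 would be NaN in C floats).
    if den < eps:
        return 0.0
    return num / den


print(weighted_scale([1.0, 2.0], [1, 2], [1.0, 1.0]))  # ordinary weights
print(weighted_scale([1.0, 2.0], [1, 2], [0.0, 0.0]))  # degenerate weights
```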

b2917

18 May 02:15
ef277de
cmake : fix typo in AMDGPU_TARGETS (#7356)

b2916

18 May 00:20
b43272a
Unicode codepoint flags for custom regexs (#7245)

* Replace CODEPOINT_TYPE_* with codepoint_flags
* Update and bugfix brute force random test
* Deterministic brute force random test
* Unicode normalization NFD
* Get rid of BOM

b2915

17 May 17:55
0fc1e82
CUDA: faster large batch FA without tensor cores (#7314)