Releases: ggerganov/llama.cpp

b3070

02 Jun 22:36
3413ae2
fix bug introduced in using calloc (#7701)

compilade pointed this out on the previous PR

b3067

02 Jun 09:59
9422c5e
[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)

* Update rpc-server.cpp to include SYCL backend

Draft PR to address inclusion of SYCL backend for RPC server

* Update rpc-server.cpp

b3066

01 Jun 22:37
e141ce6
Fix FlashAttention debug test, FP32 assert (#7684)

b3065

01 Jun 20:14
2e66683
server : new UI (#7633)

* ic

* migrate my early work

* add the belonging stuff: css,favicon etc

* de prompts

* chore: Update HTML meta tags in index.html file

* add api-key css classes

* some necessary fixes

* Add API key CSS classes and update styling in style.css

* clean the code

* move API to the top, rearrange param sliders. update css

* add tooltips to the parameters with comprehensible explanations

* fix FloatField and BoolField tooltips

* fix grammar field width

* use template literals for promptFormats.js

* update const ModelGenerationInfo

* remove ms per token, since it is not relevant for most webui users and use cases

* add phi-3 prompt template

* add phi3 to dropdown

* add css class

* update forgotten css theme

* add user message suffix

* fix chatml & add llama3 format

* fix llama3 prompt template

* more prompt format fixes

* add more common stop tokens

* add missing char

* do not separate with new line or comma

* move prompt style

* add hacky llama2 prompt solution, reduce redundancy in promptFormats.js

* fix toggle state localstorage

* add cmd-r prompt and reduce redundancy

* set default prompt to empty

* move files, clean code

* fix css path

* add a button to the new ui

* move new ui to "/public" due to otherwise problematic CORS behaviour

* include new ui in cpp

* fix wrong link to old ui

* renaming to ensure consistency

* fix typos "prompt-format" -> "prompt-formats"

* use correct indent

* add new ui files to makefile

* fix typo

b3063

01 Jun 14:48
750f60c
CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)

b3058

31 May 16:12
30e238b
Improve HIP compatibility (#7672)

b3056

31 May 12:19
0c27e6f
ggml : fix loongson compile warnings (#7537)

* ggml : fix loongson compile warnings

ggml-ci

* Fix loongarch quantize test failure.

Fix an unexpected error introduced during the rebase.

* tests : disable json test due to lack of python on the CI node

ggml-ci

---------

Co-authored-by: junchao-loongson <zhaojunchao@loongson.cn>

b3051

30 May 22:43
5921b8f
llama : cache llama_token_to_piece (#7587)

* llama : cache llama_token_to_piece

ggml-ci

* llama : use vectors and avoid has_cache

ggml-ci

* llama : throw on unknown tokenizer types

ggml-ci

* llama : print a log of the total cache size

b3046

30 May 14:39
9c4c9cc
Move convert.py to examples/convert-legacy-llama.py (#7430)

* Move convert.py to examples/convert-no-torch.py

* Fix CI, scripts, readme files

* convert-no-torch -> convert-legacy-llama

* Move vocab thing to vocab.py

* Fix convert-no-torch -> convert-legacy-llama

* Fix lost convert.py in ci/run.sh

* Fix imports

* Fix gguf not imported correctly

* Fix flake8 complaints

* Fix check-requirements.sh

* Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE

* Review fixes

b3045

30 May 14:23
59b0d07
faster avx512 exp implementation (#7551)

* faster avx512 exp implementation

* x->r

* improve accuracy, handle special cases

* remove `e`