Releases: ggerganov/llama.cpp

b1500

09 Nov 02:50
57ad015
server : add min_p param (#3877)

* Update server.cpp with min_p after it was introduced in https://github.com/ggerganov/llama.cpp/pull/3841

* Use spaces instead of tabs

* Update index.html.hpp after running deps.sh

* Fix test - fix line ending
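The min_p parameter added here filters out tokens whose probability falls below a fraction of the most likely token's probability. A minimal standalone sketch of that idea (function name and shape are illustrative, not llama.cpp's actual sampling API):

```cpp
#include <algorithm>
#include <vector>

// Sketch of min_p filtering: keep only candidates whose probability is at
// least min_p times the probability of the single most likely token.
// Illustrative stand-in, not llama.cpp's real sampler code.
std::vector<float> min_p_filter(const std::vector<float> & probs, float min_p) {
    const float max_prob  = *std::max_element(probs.begin(), probs.end());
    const float threshold = min_p * max_prob;
    std::vector<float> kept;
    for (float p : probs) {
        if (p >= threshold) {
            kept.push_back(p);
        }
    }
    return kept;
}
```

With min_p = 0.05, a candidate at probability 0.01 is dropped when the top candidate sits at 0.5, since 0.01 < 0.05 * 0.5.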

b1499

08 Nov 12:41
875fb42
ggml-alloc : fix backend assignments of views (#3982)

b1497

07 Nov 17:45
413503d
make : do not add linker flags when compiling static llava lib (#3977)

b1496

07 Nov 08:28
e9c1cec
Compare
Choose a tag to compare
ggml : fix backward rope after YaRN (#3974)

* fix backward process of rope

rope backward process was broken after YaRN RoPE (#2268) implementation, due to missing changes in backward functions.

the code for the backward process is nearly identical to the forward process:
the only difference is the sign of the sine values.

to avoid future regressions remove the near-duplicate backward functions and reuse the forward code:

for this a new function argument `bool forward` was added to `ggml_compute_forward_rope_f32` and `ggml_compute_forward_rope_f16`.
the sin-values will be negated when forward is false.

* fix finetune rope call to use correct default attn_factor of 1.0f

* remove unused `ggml_rope_xpos_back`

it is better to have only one `ggml_rope_back` function that accepts all rope parameters, so that `ggml_compute_backward` can propagate all parameters without having to switch between different rope_back variants.

* fix comments explaining the sine sign in ggml_forward_rope

* add missing function arguments in declaration

* fix function argument type in declaration
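The fix above hinges on one observation: the RoPE backward pass is the forward rotation with the sine negated, so a single function with a `bool forward` flag can serve both directions. A minimal standalone sketch of that idea on a single rotated pair (illustrative, not ggml's actual `ggml_compute_forward_rope_f32`):

```cpp
#include <cassert>
#include <cmath>

// Rotate the pair (x0, x1) by angle theta. When forward is false the sine
// is negated, which applies the inverse (transposed) rotation - exactly
// what the backward pass needs. Illustrative stand-in for the shared
// forward/backward rope code described in the commit.
void rope_rotate(float & x0, float & x1, float theta, bool forward) {
    const float cos_t = cosf(theta);
    const float sin_t = forward ? sinf(theta) : -sinf(theta);
    const float r0 = x0 * cos_t - x1 * sin_t;
    const float r1 = x0 * sin_t + x1 * cos_t;
    x0 = r0;
    x1 = r1;
}
```

Applying the forward rotation and then the backward one returns the original values, which is why sharing the code this way guards against the two paths drifting apart again.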

b1495

07 Nov 08:15
54b4df8
Use params when loading models in llava-cli (#3976)

llava-cli was loading models with default params, ignoring the settings
given on the command line. This switches to a generic function that loads
the params from the CLI options.
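The shape of the fix: parse the command-line options into a params struct first, then hand that struct to the model loader instead of defaults. A minimal sketch with illustrative names (the struct and parser below are stand-ins, not llama.cpp's real `gpt_params` machinery):

```cpp
#include <cstdlib>
#include <string>

// Illustrative stand-in for a params struct filled from the CLI before
// the model is loaded, rather than loading with hard-coded defaults.
struct cli_params_sketch {
    int n_ctx        = 512; // default context size
    int n_gpu_layers = 0;   // default: CPU only
};

cli_params_sketch parse_cli(int argc, const char ** argv) {
    cli_params_sketch p;
    for (int i = 1; i < argc; ++i) {
        const std::string a = argv[i];
        if (a == "-c" && i + 1 < argc) {
            p.n_ctx = atoi(argv[++i]);
        } else if (a == "-ngl" && i + 1 < argc) {
            p.n_gpu_layers = atoi(argv[++i]);
        }
    }
    return p; // pass this to the loader instead of defaults
}
```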

b1494

07 Nov 07:09
46876d2
cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)

* prototyping the idea of supporting running on CPU for a GGML_USE_CUBLAS=ON build

* doc: add comments to ggml_cublas_loaded()

* fix defined(...)
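The pattern this change enables: a binary built with GGML_USE_CUBLAS=ON checks at runtime whether the CUDA libraries actually initialized, and falls back to the CPU otherwise. A minimal sketch, where the flag and helpers are illustrative stand-ins for ggml's internals (the commit names `ggml_cublas_loaded()` as the real check):

```cpp
// Would be set to true only if CUDA initialization succeeds at startup;
// illustrative stand-in, not ggml's real state.
static bool g_cublas_loaded = false;

// plays the role of the ggml_cublas_loaded() check mentioned in the commit
bool cublas_loaded() { return g_cublas_loaded; }

// decide whether to offload work to the GPU: only when cuBLAS actually
// loaded AND the user asked for GPU layers
bool should_offload_to_gpu(int n_gpu_layers) {
    return cublas_loaded() && n_gpu_layers > 0;
}
```

With this, the same GGML_USE_CUBLAS=ON binary runs cleanly on machines without a usable CUDA setup instead of failing at load time.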

b1493

06 Nov 21:57
381efbf
llava : expose as a shared library for downstream projects (#3613)

* wip llava python bindings compatibility

* add external llava API

* add base64 in-prompt image support

* wip refactor image loading

* refactor image load out of llava init

* cleanup

* further cleanup; move llava-cli into its own file and rename

* move base64.hpp into common/

* collapse clip and llava libraries

* move llava into its own subdir

* wip

* fix bug where base64 string was not removed from the prompt

* get libllava to output in the right place

* expose llava methods in libllama.dylib

* cleanup memory usage around clip_image_*

* cleanup and refactor *again*

* update headerdoc

* build with cmake, not tested (WIP)

* Editorconfig

* Editorconfig

* Build with make

* Build with make

* Fix cyclical deps on Windows

* attempt to fix build on Windows

* attempt to fix build on Windows

* Upd TODOs

* attempt to fix build on Windows+CUDA

* Revert changes in cmake

* Fix according to review comments

* Support building as a shared library

* address review comments

---------

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>

b1492

05 Nov 19:33
2833a6f
ggml-cuda : fix f16 mul mat (#3961)

* ggml-cuda : fix f16 mul mat

ggml-ci

* silence common.cpp warning (bonus)

b1491

05 Nov 17:33
d9ccce2
Allow common process_escapes to handle \x sequences (#3928)

* Allow common process_escapes to handle \x sequences

* Fix edge case when second hex digit is NUL
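The edge case here is that `\x` escapes may be followed by only one hex digit, for example when the string ends so the next character read is NUL. A standalone sketch of hex-escape handling that covers that case (illustrative, not llama.cpp's actual `process_escapes`):

```cpp
#include <string>

// Map a hex digit to its value, or -1 if the char is not a hex digit
// (this also covers NUL / end-of-string cleanly).
int hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Expand "\xNN" escapes; a lone "\xN" (second digit missing or invalid)
// is treated as a one-digit escape instead of reading past the string.
// Illustrative stand-in for the process_escapes change above.
std::string process_hex_escapes(const std::string & s) {
    std::string out;
    for (size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '\\' && i + 1 < s.size() && s[i + 1] == 'x') {
            const int hi = (i + 2 < s.size()) ? hex_val(s[i + 2]) : -1;
            if (hi >= 0) {
                const int lo = (i + 3 < s.size()) ? hex_val(s[i + 3]) : -1;
                if (lo >= 0) {
                    out += (char)(hi * 16 + lo);
                    i += 3;
                } else {
                    out += (char)hi; // only one hex digit present
                    i += 2;
                }
                continue;
            }
        }
        out += s[i];
    }
    return out;
}
```

So `"\x41"` expands to `"A"`, while a trailing `"\x4"` yields the single byte 0x04 instead of reading the NUL terminator as a digit.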

b1489

05 Nov 15:31
132d25b
cuda : fix disabling device with --tensor-split 1,0 (#3951)

Co-authored-by: slaren <slarengh@gmail.com>