Releases · ggerganov/llama.cpp

09 Nov 02:50

57ad015

b1500

server : add min_p param (#3877)

* Update server.cpp with min_p after it was introduced in https://github.com/ggerganov/llama.cpp/pull/3841

* Use spaces instead of tabs

* Update index.html.hpp after running deps.sh

* Fix test - fix line ending
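
For readers unfamiliar with the `min_p` parameter, the following is a minimal sketch of min-p filtering, not the code from server.cpp or the sampling PR. It assumes the usual definition: a token survives if its probability is at least `min_p` times the probability of the most likely token. The `Candidate` struct, the `min_p_filter` name and the `min_keep` argument are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token candidate: an id plus an already-normalized probability.
struct Candidate {
    int   id;
    float p;
};

// Keep only candidates with p >= min_p * (highest probability).
// At least min_keep candidates are always retained.
std::vector<Candidate> min_p_filter(std::vector<Candidate> cands, float min_p, std::size_t min_keep = 1) {
    assert(min_p >= 0.0f && min_p <= 1.0f);
    if (cands.empty()) {
        return cands;
    }

    // Sort by descending probability so the most likely token comes first.
    std::sort(cands.begin(), cands.end(),
              [](const Candidate & a, const Candidate & b) { return a.p > b.p; });

    const float threshold = cands.front().p * min_p;

    // Count how many leading candidates pass the threshold (or are forced by min_keep).
    std::size_t keep = 0;
    while (keep < cands.size() && (cands[keep].p >= threshold || keep < min_keep)) {
        ++keep;
    }

    cands.erase(cands.begin() + static_cast<std::ptrdiff_t>(keep), cands.end());
    return cands;
}
```

A real sampler would normally renormalize the surviving probabilities before drawing a token; that step is left out here to keep the sketch short.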

08 Nov 12:41

github-actions

b1499

875fb42

ggml-alloc : fix backend assignments of views (#3982)

07 Nov 17:45

github-actions

b1497

413503d

make : do not add linker flags when compiling static llava lib (#3977)

07 Nov 08:28

github-actions

b1496

e9c1cec

ggml : fix backward rope after YaRN (#3974)

* fix backward process of rope

rope backward process was broken after YaRN RoPE (#2268) implementation, due to missing changes in backward functions.

the code for the backward process is nearly identical to the forward process:
the only difference is the sign of the sin-values.

to avoid future regressions remove the near-duplicate backward functions and reuse the forward code:

for this a new function argument `bool forward` was added to `ggml_compute_forward_rope_f32` and `ggml_compute_forward_rope_f16`.
the sin-values will be negated when forward is false.

* fix finetune rope call to use correct default attn_factor of 1.0f

* remove unused `ggml_rope_xpos_back`

it is better to have only one `ggml_rope_back` function that accepts all rope parameters, so that `ggml_compute_backward` can propagate all parameters without having to switch between different rope_back variants.

* fix comments explaining the sign of the sine values in ggml_forward_rope

* add missing function arguments in declaration

* fix function argument type in declaration
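
The forward/backward relationship described in this entry can be sketched for a single pair of values. This is an illustration only, not the ggml implementation: `rope_rotate_pair` is a hypothetical helper, and the real `ggml_compute_forward_rope_f32` operates on whole tensors with additional YaRN parameters. The sketch shows only the point made above, that the backward pass reuses the forward rotation with the sine negated.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Rotate one (x0, x1) pair by the angle theta.
// When forward is false the sine is negated, which applies the inverse
// rotation (the transpose of the rotation matrix) needed by the backward pass.
std::pair<float, float> rope_rotate_pair(float x0, float x1, float theta, bool forward) {
    assert(std::isfinite(theta));

    const float cos_theta = std::cos(theta);
    float       sin_theta = std::sin(theta);
    if (!forward) {
        sin_theta = -sin_theta;
    }

    return { x0 * cos_theta - x1 * sin_theta,
             x0 * sin_theta + x1 * cos_theta };
}
```

Applying the function with `forward = true` and then with `forward = false` on the result returns the original pair up to floating-point error, which is why a single code path with a flag can serve both directions.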

07 Nov 08:15

github-actions

b1495

54b4df8

Use params when loading models in llava-cli (#3976)

llava-cli was loading models with default params and ignoring settings
from the cli. This switches to a generic function to load the params
from the cli options.

07 Nov 07:09

github-actions

b1494

46876d2

cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)

* prototyping the idea that supports running on CPU for a GGML_USE_CUBLAS=on build

* doc: add comments to ggml_cublas_loaded()

* fix defined(...)

06 Nov 21:57

github-actions

b1493

381efbf

llava : expose as a shared library for downstream projects (#3613)

* wip llava python bindings compatibility

* add external llava API

* add base64 in-prompt image support

* wip refactor image loading

* refactor image load out of llava init

* cleanup

* further cleanup; move llava-cli into its own file and rename

* move base64.hpp into common/

* collapse clip and llava libraries

* move llava into its own subdir

* wip

* fix bug where base64 string was not removed from the prompt

* get libllava to output in the right place

* expose llava methods in libllama.dylib

* cleanup memory usage around clip_image_*

* cleanup and refactor *again*

* update headerdoc

* build with cmake, not tested (WIP)

* Editorconfig

* Editorconfig

* Build with make

* Build with make

* Fix cyclical deps on Windows

* attempt to fix build on Windows

* attempt to fix build on Windows

* Upd TODOs

* attempt to fix build on Windows+CUDA

* Revert changes in cmake

* Fix according to review comments

* Support building as a shared library

* address review comments

---------

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>

05 Nov 19:33

github-actions

b1492

2833a6f

ggml-cuda : fix f16 mul mat (#3961)

* ggml-cuda : fix f16 mul mat

ggml-ci

* silence common.cpp warning (bonus)

05 Nov 17:33

github-actions

b1491

d9ccce2

Allow common process_escapes to handle \x sequences (#3928)

* Allow common process_escapes to handle \x sequences

* Fix edge case when second hex digit is NUL
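
As an illustration of what handling `\x` sequences involves, here is a minimal sketch of escape processing with two-digit hex support. It is not the `process_escapes` code from common: the `unescape` and `hex_value` names are hypothetical, only a few escapes are covered, and leaving an incomplete or invalid `\x` sequence verbatim is an assumption of this sketch. Because it works on a length-checked `std::string`, a truncated sequence is caught by the bounds check rather than by looking for a NUL terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replace \n, \t, \\ and \xHH escapes; anything else is copied unchanged.
std::string unescape(const std::string & in) {
    std::string out;
    out.reserve(in.size());

    for (std::size_t i = 0; i < in.size(); ++i) {
        // Ordinary character, or a trailing backslash with nothing after it.
        if (in[i] != '\\' || i + 1 >= in.size()) {
            out += in[i];
            continue;
        }

        switch (in[i + 1]) {
            case 'n':  out += '\n'; ++i; break;
            case 't':  out += '\t'; ++i; break;
            case '\\': out += '\\'; ++i; break;
            case 'x': {
                // Both hex digits must exist and be valid.
                const bool has_two = i + 3 < in.size();
                const int  hi = has_two ? hex_value(in[i + 2]) : -1;
                const int  lo = has_two ? hex_value(in[i + 3]) : -1;
                if (hi >= 0 && lo >= 0) {
                    assert(hi * 16 + lo <= 0xFF);
                    out += static_cast<char>(hi * 16 + lo);
                    i += 3; // skip 'x' and the two digits
                } else {
                    out += '\\'; // keep the sequence verbatim
                }
                break;
            }
            default:
                out += '\\'; // unknown escape: keep the backslash
                break;
        }
    }
    return out;
}
```

With this sketch, the input `a\x41b` becomes `aAb`, while the truncated input `\x4` is passed through unchanged.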

05 Nov 15:31

github-actions

b1489

132d25b

cuda : fix disabling device with --tensor-split 1,0 (#3951)

Co-authored-by: slaren <slarengh@gmail.com>
