Pull requests: ggerganov/llama.cpp
Tokenizer SPM fixes for phi-3 and llama-spm
python
python script changes
testing
Everything test related
#7375
opened May 18, 2024 by
jaime-m-p
fix inverted strcmp checking for quantize --keep-split
examples
#7374
opened May 18, 2024 by
fredlas
Add minimal python client example for the server, streaming callback
examples
python
python script changes
server
#7373
opened May 18, 2024 by
chrismrutherford
grammars: early exit when no next_candidates to reject
#7370
opened May 18, 2024 by
ochafik
labeler.yml: Use settings from ggerganov/llama.cpp [no ci]
devops
improvements to build systems and github actions
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7363
opened May 18, 2024 by
mofosyne
Vulkan Embedding Fix
bugfix
fixes an issue or bug
python
python script changes
review complexity : high
Generally require indepth knowledge of LLMs or GPUs
Vulkan
Issues specific to the Vulkan backend
#7360
opened May 18, 2024 by
0cc4m
OpenELM support
model
Model specific
python
python script changes
review complexity : high
Generally require indepth knowledge of LLMs or GPUs
#7359
opened May 18, 2024 by
icecream95
•
Draft
examples: cache hf model when --model not provided
enhancement
New feature or request
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7353
opened May 17, 2024 by
amirzia
SimpleChat: a simple and dumb web front end for testing /chat/completions and /completions end points and try chat
enhancement
New feature or request
examples
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
testing
Everything test related
#7350
opened May 17, 2024 by
hanishkvc
Add StableLM2 pre-tokenizer
model
Model specific
python
python script changes
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7349
opened May 17, 2024 by
aahouzi
server: add test for token probs
enhancement
New feature or request
examples
python
python script changes
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
testing
Everything test related
#7347
opened May 17, 2024 by
JohannesGaessler
Another threadpool: Avoid creating hundreds of threads in GGML
performance
Speed related topics
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7342
opened May 17, 2024 by
besnardjb
add Viking tokenizer support
model
Model specific
python
python script changes
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7329
opened May 16, 2024 by
jonabur
Viking-7B tokenizer support
model
Model specific
python
python script changes
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Fixed painfully slow single process builds.
build
Compilation issues
need feedback
Testing and feedback with results are needed
performance
Speed related topics
#7326
opened May 16, 2024 by
jboero
[SYCL] Update SYCL upscale operation
generation quality
Quality of model output
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#7321
opened May 16, 2024 by
AidanBeltonS
sched : support async weight copy
performance
Speed related topics
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
Add phi-2 tokenizer
model
Model specific
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7300
opened May 15, 2024 by
BramVanroy
avoid to get prompt in infill mode and embedding mode
examples
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
server
common: free ctx_gguf when exiting llama_control_vector_load_one
bugfix
fixes an issue or bug
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7285
opened May 14, 2024 by
stevegrubb
ggml-opencl, llama: using reserve() if count already known
refactoring
Refactoring
review complexity : high
Generally require indepth knowledge of LLMs or GPUs
#7272
opened May 14, 2024 by
GermanAizek
•
Draft
common, ngram_cache: added const reference for std::pair<> and std::tuple<> more 16 bytes:
refactoring
Refactoring
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7270
opened May 14, 2024 by
GermanAizek
•
Draft
ggml, ngram-cache, log: added const and const ref for function params
refactoring
Refactoring
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7269
opened May 14, 2024 by
GermanAizek