Pull requests: ggerganov/llama.cpp

Pull requests list

cuda : clear error after buffer allocation failure
#7376 opened May 18, 2024 by slaren
Tokenizer SPM fixes for phi-3 and llama-spm
Labels: python, testing
#7375 opened May 18, 2024 by jaime-m-p
ggml: implement quantized KV cache for FA
#7372 opened May 18, 2024 by JohannesGaessler
grammars: early exit when no next_candidates to reject
#7370 opened May 18, 2024 by ochafik
labeler.yml: Use settings from ggerganov/llama.cpp [no ci]
Labels: devops, review complexity : low
#7363 opened May 18, 2024 by mofosyne
Vulkan Embedding Fix
Labels: bugfix, python, review complexity : high, Vulkan
#7360 opened May 18, 2024 by 0cc4m
OpenELM support
Labels: model, python, review complexity : high
#7359 opened May 18, 2024 by icecream95 (Draft)
examples: cache hf model when --model not provided
Labels: enhancement, review complexity : low
#7353 opened May 17, 2024 by amirzia
SimpleChat: a simple and dumb web front end for testing /chat/completions and /completions end points and try chat
Labels: enhancement, examples, review complexity : medium, server, testing
#7350 opened May 17, 2024 by hanishkvc
Add StableLM2 pre-tokenizer
Labels: model, python, review complexity : low
#7349 opened May 17, 2024 by aahouzi
server: add test for token probs
Labels: enhancement, examples, python, review complexity : medium, server, testing
#7347 opened May 17, 2024 by JohannesGaessler
Another threadpool: Avoid creating hundreds of threads in GGML
Labels: performance, review complexity : medium
#7342 opened May 17, 2024 by besnardjb
add Viking tokenizer support
Labels: model, python, review complexity : low
#7329 opened May 16, 2024 by jonabur
Viking-7B tokenizer support
Labels: model, python, review complexity : low
#7328 opened May 16, 2024 by akx (Draft)
Fixed painfully slow single process builds.
Labels: build, need feedback, performance
#7326 opened May 16, 2024 by jboero
[SYCL] Update SYCL upscale operation
Labels: generation quality, review complexity : medium, SYCL
#7321 opened May 16, 2024 by AidanBeltonS
sched : support async weight copy
Labels: performance, review complexity : medium
#7315 opened May 15, 2024 by slaren (Draft)
Add phi-2 tokenizer
Labels: model, review complexity : medium
#7300 opened May 15, 2024 by BramVanroy
avoid to get prompt in infill mode and embedding mode
Labels: examples, review complexity : low, server
#7286 opened May 14, 2024 by woodx9 (Draft)
common: free ctx_gguf when exiting llama_control_vector_load_one
Labels: bugfix, review complexity : low
#7285 opened May 14, 2024 by stevegrubb
ggml-opencl, llama: using reserve() if count already known
Labels: refactoring, review complexity : high
#7272 opened May 14, 2024 by GermanAizek (Draft)
common, ngram_cache: added const reference for std::pair<> and std::tuple<> more 16 bytes:
Labels: refactoring, review complexity : low
#7270 opened May 14, 2024 by GermanAizek (Draft)
ggml, ngram-cache, log: added const and const ref for function params
Labels: refactoring, review complexity : medium
#7269 opened May 14, 2024 by GermanAizek