Paddler: open source stateful load balancer custom-tailored for llama.cpp #7369
mcharytoniuk started this conversation in Show and tell
Replies: 1 comment
Looks interesting. Thanks for sharing!
Hello! :)
I recently finished a new project. I needed a load balancer tailored specifically to llama.cpp, one that takes its specifics into account (slot usage, continuous batching). It also works in auto-scaling environments (you can freely add and remove hosts).
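To make "takes slot usage into account" a bit more concrete, here is a minimal sketch of slot-aware host selection. It is not Paddler's actual code or algorithm. It assumes each llama.cpp instance reports how many slots are idle and how many are processing, and the `Host`, `Pool`, and `pick_host` names are hypothetical, made up for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Host:
    # Hypothetical record for one llama.cpp server instance.
    name: str
    slots_idle: int        # assumed to be reported by the instance
    slots_processing: int  # assumed to be reported by the instance


class Pool:
    """Illustrative registry of hosts that can join and leave at any time."""

    def __init__(self) -> None:
        self._hosts: Dict[str, Host] = {}

    def upsert(self, host: Host) -> None:
        # Called when a host registers or reports fresh slot counts.
        self._hosts[host.name] = host

    def remove(self, name: str) -> None:
        # Called when a host is scaled down or stops reporting.
        self._hosts.pop(name, None)

    def pick_host(self) -> Optional[Host]:
        # Prefer the host with the most idle slots; skip fully busy hosts.
        candidates = [h for h in self._hosts.values() if h.slots_idle > 0]
        if not candidates:
            return None  # caller may queue the request or reject it
        return max(candidates, key=lambda h: h.slots_idle)


pool = Pool()
pool.upsert(Host("a", slots_idle=1, slots_processing=3))
pool.upsert(Host("b", slots_idle=3, slots_processing=1))
pool.upsert(Host("c", slots_idle=0, slots_processing=4))
chosen = pool.pick_host()
```

In this sketch the balancer is "stateful" only in the sense that it keeps the latest slot counts per host, so adding or removing a host is just an insert or delete in the registry.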
Let me know what you think. Thank you all for creating and maintaining llama.cpp!
PS. I called it "paddler" because I initially wanted to use the Raft protocol, but in the end it turned out to be unnecessary. I kept the name, though. :)
Repo:
https://github.com/distantmagic/paddler