Add minimal python client example for the server, streaming callback #7373
+176
−0
When integrating llama.cpp with web services I use the server for inference. This approach is fast and efficient, since the client is more or less a pass-through.
There doesn't seem to be a good set of Python examples for the server, possibly because most people use the OpenAI client library. I tried that library, but found it difficult to pass llama.cpp-specific parameters such as cache_prompt, and it also required the model parameter to be present even though the server does not use it.
Here is a minimal Python client that supports streaming. A streaming callback function receives each chunk of data; in the callback you can do things like watch for "." to assemble complete sentences and pass them to a voice server, minimizing latency.
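
For reference, a minimal sketch of the idea (not the exact code in this PR; names like `SERVER_URL` and `stream_completion` are illustrative, and it assumes the server's `/completion` endpoint streaming `data: {...}` lines with `content` and `stop` fields, as in current llama.cpp builds):

```python
import json
import requests

# Assumed server address; adjust to wherever llama.cpp server is running.
SERVER_URL = "http://localhost:8080/completion"

def stream_completion(prompt, on_chunk, n_predict=128, cache_prompt=True):
    """POST the prompt and invoke on_chunk(text) for each streamed piece."""
    payload = {
        "prompt": prompt,
        "n_predict": n_predict,
        "stream": True,
        "cache_prompt": cache_prompt,  # llama.cpp-specific parameter
    }
    with requests.post(SERVER_URL, json=payload, stream=True) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines(decode_unicode=True):
            if not line or not line.startswith("data: "):
                continue  # skip keep-alives and non-data lines
            chunk = json.loads(line[len("data: "):])
            on_chunk(chunk.get("content", ""))
            if chunk.get("stop"):
                break

# Example callback: accumulate text and flush whole sentences on ".",
# e.g. to hand them to a voice server with minimal latency.
buffer = []

def on_chunk(text):
    buffer.append(text)
    if "." in text:
        sentence = "".join(buffer)
        buffer.clear()
        print(sentence, flush=True)  # replace with a TTS / voice-server call

if __name__ == "__main__":
    stream_completion("Explain what a llama is in two sentences.", on_chunk)
```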