I followed the instructions described here: https://github.com/ollama/ollama/blob/main/docs/import.md.
I converted this model with the options "--ctx 8192 --outtype f16 --vocab-type bpe" and quantized the result with the option "q4_0". Both steps completed successfully.
But when I used ollama to run the result, I got "Error: llama runner process no longer running: -1".
Has anybody successfully imported and run it?
Best regards, Bill
Hi @Bill-XU. Sorry you hit an error. May I ask which model and/or model architecture you tried converting? Logs with a more specific error should be available here
Here are the details.
=== My server spec
OS: Ubuntu 22.04 LTS
Hardware: 2 CPUs / 8 GB memory (no GPU)
=== Ollama usage
Ollama service installed
(+ Open WebUI on Docker)
=== Steps of importing llama-3-8b-web
1. Downloaded assets from WebLlama on huggingface.co (Github: https://github.com/McGill-NLP/webllama)
2. Followed the instructions under "Importing (PyTorch & Safetensors)"
3. During "Convert the model", used the command "python llm/llama.cpp/convert.py /var/lib/custom_models/llama-3-8b-web --outtype f16 --outfile llama-3-8b-web.bin --ctx 8192 --vocab-type bpe"
4. Used "llm/llama.cpp/quantize llama-3-8b-web.bin llama-3-8b-web-quantized.bin q4_0" to quantize the model
5. Steps 3 and 4 both succeeded
6. Used "ollama create llama-3-8b-web -f llama-3-8b-web.modelfile" to create the model, no error
7. When executing "ollama run llama-3-8b-web", it says "Error: llama runner process no longer running: -1"
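For context on steps 3 and 4, here is a rough back-of-the-envelope sketch of how large the f16 and q4_0 weights would be. It assumes "8b" means about 8 billion parameters and that q4_0 costs about 4.5 bits per weight (32 weights packed into 18 bytes). Real files differ somewhat because of metadata and tensors kept at other precisions. The function name `estimate_gib` is made up for this illustration.

```python
def estimate_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GiB for a given parameter count and precision."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: "8b" in the model name means roughly 8 billion parameters.
N_PARAMS = 8e9

f16_gib = estimate_gib(N_PARAMS, 16)    # output of the convert step (f16)
q4_0_gib = estimate_gib(N_PARAMS, 4.5)  # output of the quantize step (q4_0)

print(f"f16  weights: about {f16_gib:.1f} GiB")
print(f"q4_0 weights: about {q4_0_gib:.1f} GiB")
```

These numbers can be compared against the 8 GB of memory in the server spec above, and against which of the two files the modelfile's FROM line points to. This is only a sanity check, not a diagnosis.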
=== Logs ollama.log
Could the result of the conversion be incorrect or broken?
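One quick way to test that suspicion is to look at the first bytes of the converted file. This is a minimal sketch, assuming the converter writes GGUF, whose files start with the 4-byte magic "GGUF" followed by a little-endian 32-bit version number. The helper name `read_gguf_version` is hypothetical, and the check only inspects the header, so a passing result does not prove the tensors are intact.

```python
import struct
from typing import Optional

GGUF_MAGIC = b"GGUF"


def read_gguf_version(path: str) -> Optional[int]:
    """Return the GGUF version if the file starts with a GGUF header, else None."""
    with open(path, "rb") as f:
        head = f.read(8)
    # A valid header needs 4 magic bytes plus a 4-byte version field.
    if len(head) < 8 or head[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", head[4:8])
    return version


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        version = read_gguf_version(name)
        if version is None:
            print(f"{name}: no GGUF header found")
        else:
            print(f"{name}: GGUF version {version}")
```

It could be run against both llama-3-8b-web.bin and llama-3-8b-web-quantized.bin by passing the paths as arguments.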