Is there anybody who successfully imported llama-3-8b-web? #4489

Open
Bill-XU opened this issue May 17, 2024 · 2 comments
Labels: bug

Comments


Bill-XU commented May 17, 2024

I followed the instructions described here: https://github.com/ollama/ollama/blob/main/docs/import.md.
I converted this model using the options "--ctx 8192 --outtype f16 --vocab-type bpe" and quantized the result with "q4_0". Both steps completed successfully.
But when I run the result with ollama, I get "Error: llama runner process no longer running: -1".

Has anybody successfully imported and run it?

Best regards, Bill

jmorganca (Member) commented May 17, 2024

Hi @Bill-XU. Sorry you hit an error. May I ask which model and/or model architecture you tried converting? Logs with a more specific error should be available here.
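For a Linux service install, the Ollama troubleshooting docs point to the systemd journal; a minimal sketch for pulling the recent service log, assuming the default ollama systemd unit:

    # Show the most recent entries from the Ollama service log (systemd installs)
    journalctl -e -u ollama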

jmorganca added the bug label on May 17, 2024

Bill-XU commented May 17, 2024

> Hi @Bill-XU. Sorry you hit an error. May I ask which model and/or model architecture you tried converting? Logs with a more specific error should be available here.

Hi @jmorganca

Here are the details.
=== My server spec
OS: Ubuntu 22.04 LTS
Hardware: 2 CPUs / 8 GB memory (no GPU)
=== Ollama usage
Ollama installed as a service (plus Open WebUI on Docker)
=== Steps for importing llama-3-8b-web
1. Downloaded assets from WebLlama on huggingface.co (GitHub: https://github.com/McGill-NLP/webllama)
2. Followed the "Importing (PyTorch & Safetensors)" instructions
3. During "Convert the model", used the command "python llm/llama.cpp/convert.py /var/lib/custom_models/llama-3-8b-web --outtype f16 --outfile llama-3-8b-web.bin --ctx 8192 --vocab-type bpe"
4. Used "llm/llama.cpp/quantize llama-3-8b-web.bin llama-3-8b-web-quantized.bin q4_0" to quantize the model
5. Both steps 3 and 4 succeeded
6. Used "ollama create llama-3-8b-web -f llama-3-8b-web.modelfile" to create the model, with no error (a sketch of what such a Modelfile might contain is shown after this list)
7. When executing "ollama run llama-3-8b-web", it fails with "Error: llama runner process no longer running: -1"
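The actual llama-3-8b-web.modelfile is not included above, so the following is only a minimal sketch of what a Modelfile for a locally converted GGUF typically contains; the FROM path and the Llama 3 prompt template below are assumptions, not the file actually used:

    # Minimal Modelfile sketch (assumed; the real file from step 6 is not shown)
    # FROM is assumed to point at the quantized output of step 4
    FROM ./llama-3-8b-web-quantized.bin

    # Llama 3 instruct-style prompt template (an assumption; adjust to the model's actual template)
    TEMPLATE """<|start_header_id|>user<|end_header_id|>

    {{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

    """
    PARAMETER stop "<|eot_id|>"

As a side note, if FROM points at the unquantized f16 llama-3-8b-web.bin instead of the quantized file, an 8 GB machine would likely not have enough memory to load it, which on its own can crash the runner.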

=== Logs
ollama.log
Could it be that the result of the conversion was incorrect or broken?
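One way to narrow that down is to load the converted file directly with llama.cpp, bypassing Ollama entirely; a minimal sketch, assuming the llama.cpp checkout from step 3 has been built and its main example binary is available:

    # Load the quantized GGUF directly with llama.cpp; if this also fails,
    # the conversion output itself is likely the problem
    ./llm/llama.cpp/main -m llama-3-8b-web-quantized.bin -p "Hello" -n 16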

Best regards,
Bill
