
llama.cpp failing #371

Conversation

@bet0x commented Apr 22, 2024

llama.cpp is failing to generate quantized versions of the trained models.

Error:

```bash
You might have to compile llama.cpp yourself, then run this again.
You do not need to close this Python program. Run the following commands in a new terminal:
You must run this in the same folder as you're saving your model.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && LLAMA_CUDA=1 make all -j
Once that's done, redo the quantization.
```

But when I clone it with `--recursive`, it works.
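For reference, the working sequence is just the suggested commands with `--recursive` added (a sketch; `LLAMA_CUDA=1` assumes an NVIDIA CUDA toolchain, drop it for a CPU-only build):

```bash
# Clone with submodules; the plain clone suggested in the error message
# skips them, which appears to be why the build then fails.
git clone --recursive https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && LLAMA_CUDA=1 make all -j
```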
@danielhanchen (Contributor) commented

Oh my, I need to check this ASAP. Thanks for the heads up!

@dynamite9999 commented

I have the same issue. I followed the instructions to clone and make, but I still get the same error. However, if I manually run the conversion inside llama.cpp, it partially works on the untrained, unmerged model:

```bash
..INFO:hf-to-gguf:Model successfully exported to '../unsloth/llama-3-8b-bnb-4bit/ggml-model-f16.gguf'
```
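For anyone reproducing the manual step, the conversion plus quantization inside llama.cpp looks roughly like this (a sketch; the script and binary names are from April 2024 llama.cpp checkouts and have since been renamed, and the model paths are assumptions based on the log line above):

```bash
cd llama.cpp

# Convert the HF checkpoint to an f16 GGUF. Newer llama.cpp trees name
# this script convert_hf_to_gguf.py instead.
python convert-hf-to-gguf.py ../unsloth/llama-3-8b-bnb-4bit \
    --outtype f16 \
    --outfile ../unsloth/llama-3-8b-bnb-4bit/ggml-model-f16.gguf

# Quantize the f16 GGUF to q4_k_m. Newer trees rename this binary llama-quantize.
./quantize ../unsloth/llama-3-8b-bnb-4bit/ggml-model-f16.gguf \
    ../unsloth/llama-3-8b-bnb-4bit/ggml-model-q4_k_m.gguf \
    q4_k_m
```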

Here is the error message from `model.save_pretrained_gguf(TRAINED_GGUF_MODEL, tokenizer, quantization_method = "q4_k_m")`, which worked until a few days ago.

```bash
Unsloth: Converting llama model. Can use fast conversion = True.
==((====))==  Unsloth: Conversion from QLoRA to GGUF information
   \\   /|    [0] Installing llama.cpp will take 3 minutes.
O^O/ \_/ \    [1] Converting HF to GGUF 16bits will take 3 minutes.
\        /    [2] Converting GGUF 16bits to q4_k_m will take 20 minutes.
 "-____-"     In total, you will have to wait around 26 minutes.

Unsloth: [0] Installing llama.cpp. This will take 3 minutes...
Unsloth: [1] Converting model at unsloth/llama-3-8b-bnb-4bit into f16 GGUF format.
The output location will be ./unsloth/llama-3-8b-bnb-4bit-unsloth.F16.gguf
This will take 3 minutes...
Traceback (most recent call last):
  File "/home/d/hp/NetAnalytics/dev/netai/syslog/syslog_scraper_netai/t59_nie_func_data/nie_trainer.v1.py", line 1264, in <module>
    main()
  File "/home/d/hp/NetAnalytics/dev/netai/syslog/syslog_scraper_netai/t59_nie_func_data/nie_trainer.v1.py", line 1232, in main
    model.save_pretrained_gguf(new_model, tokenizer, quantization_method = "q4_k_m")
  File "/home/d/.local/lib/python3.11/site-packages/unsloth/save.py", line 1340, in unsloth_save_pretrained_gguf
    file_location = save_to_gguf(model_type, new_save_directory, quantization_method, first_conversion, makefile)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/d/.local/lib/python3.11/site-packages/unsloth/save.py", line 964, in save_to_gguf
    raise RuntimeError(
RuntimeError: Unsloth: Quantization failed for ./unsloth/llama-3-8b-bnb-4bit-unsloth.F16.gguf
You might have to compile llama.cpp yourself, then run this again.
You do not need to close this Python program. Run the following commands in a new terminal:
You must run this in the same folder as you're saving your model.
git clone --recursive https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && LLAMA_CUDA=1 make all -j
Once that's done, redo the quantization.
```
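Before re-running the quantization step, it may be worth checking that the rebuild actually produced the binary Unsloth shells out to (a sketch; file names assume an April 2024 llama.cpp checkout):

```bash
# Both of these must exist for the GGUF export to succeed; if quantize is
# missing, the make step above did not complete.
ls -l llama.cpp/convert-hf-to-gguf.py llama.cpp/quantize
```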
