Hello,

I have fine-tuned a Llama 3 model and now I would love to use it on a CPU. I tried to use device_map = 'cpu' when loading the model. However, I am still encountering CUDA issues such as

RuntimeError: CUDA error: an illegal memory access was encountered

or my kernel crashing.

After taking a deeper look into the code, I've noticed that many parts are hardwired to use CUDA: https://github.com/search?q=repo%3Aunslothai%2Funsloth+cuda&type=code

Could you provide any tips on how to use my fine-tuned model on the CPU, or let me know if it's not possible?

Thank you!

Follow-up from the original poster:

Thank you for your answer. I already feared that would be the case. I was wondering if it is possible to convert the model I already trained with Unsloth into Transformers? Or is there a way to import the checkpoints into a compatible Transformers model?

Reply:

@code-ksu I believe the model can be loaded directly into Transformers. Moreover, I don't know your use case, but converting to GGUF (llama.cpp) may also help for CPU inference.
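Assuming the fine-tuned checkpoint was saved in the standard Transformers format (via save_pretrained), loading it with plain Transformers on CPU might look like the sketch below. The model_dir path is a placeholder, and this is an untested outline rather than a verified recipe for this exact checkpoint:

```python
# Minimal sketch: load an Unsloth-trained checkpoint with plain
# Transformers, keeping every weight on CPU. "my-llama3-finetune"
# is a hypothetical local checkpoint directory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "my-llama3-finetune"  # placeholder path

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="cpu",           # force CPU placement, no CUDA calls
    torch_dtype=torch.float32,  # CPU kernels generally expect fp32
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Because the hard-wired CUDA paths live in Unsloth's own kernels, the key point of this sketch is that inference goes through vanilla Transformers only, without importing unsloth at all.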
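For the GGUF route mentioned in the reply, a conversion with llama.cpp could look roughly like this; the checkpoint directory and output filenames are placeholders, and the exact script and tool names may differ between llama.cpp versions:

```shell
# Hedged sketch: convert a Transformers-format checkpoint to GGUF
# for CPU inference with llama.cpp. Paths are placeholders.
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert the Hugging Face checkpoint directory to a GGUF file
python llama.cpp/convert_hf_to_gguf.py my-llama3-finetune \
    --outfile my-llama3-finetune.gguf

# Optionally quantize for faster CPU inference (requires building
# llama.cpp first; Q4_K_M is a common quality/speed trade-off)
./llama.cpp/llama-quantize my-llama3-finetune.gguf \
    my-llama3-finetune-q4_k_m.gguf Q4_K_M
```

The resulting .gguf file can then be served with llama.cpp's CPU runtime instead of PyTorch, which sidesteps CUDA entirely.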