Error while running Phi-3 with DML #336
Hi @tomas-pet, …

I am using latest …

Any update on this?
@tomas-pet We need some more information to try to reproduce this. Can you share the output of pip list? Which model are you using? Did you download it from HuggingFace? Could you also share the genai_config.json file from the model folder?
Here is the output of pip list: accelerate 0.29.2 … I am following the instructions here: https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/phi-3-tutorial.md. I am using the Phi-3-mini-128k-instruct-onnx model, downloaded exactly as the tutorial instructs: git clone https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx. The genai_config.json is attached.
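For reference, the DirectML variants of the Phi-3 ONNX models select the execution provider through the session options in genai_config.json. A minimal sketch of the relevant section (field names based on published onnxruntime-genai configs; all other fields omitted) looks like:

```json
{
  "model": {
    "decoder": {
      "session_options": {
        "provider_options": [
          { "dml": {} }
        ]
      }
    }
  }
}
```

If the provider_options entry in the attached file names "cuda" instead of "dml", the runtime would attempt the CUDA path, which would be consistent with the error below.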
Hi @tomas-pet, which hardware are you using? I see that you have the … Nevertheless, we recently made changes to adapter selection that could potentially fix your issue. You can test them out by building from source.
Still getting the same error. The problem is in your phi3-qa.py: by default it assumes I am using CUDA.
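One way to check which execution provider a model folder is actually configured for is to inspect its genai_config.json directly. This is a hedged sketch, not part of the onnxruntime-genai API: it assumes the config layout seen in published Phi-3 ONNX model folders (a provider_options list under model → decoder → session_options), and the helper name configured_providers is made up for illustration.

```python
import json
from pathlib import Path


def configured_providers(model_dir):
    """Return the execution provider names listed in a model folder's genai_config.json."""
    config = json.loads(Path(model_dir, "genai_config.json").read_text())
    options = (
        config.get("model", {})
              .get("decoder", {})
              .get("session_options", {})
              .get("provider_options", [])
    )
    # Each entry is a single-key dict, e.g. {"dml": {}} or {"cuda": {...}}
    return [name for entry in options for name in entry]
```

Running this against the directml-int4-awq-block-128 folder should report "dml"; if it reports "cuda" instead, the wrong model variant (or a stale config) is being loaded.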
Can you tell me which hardware you're trying to run on? We don't support ARM builds yet, and although it might work with the x64 emulation layer, it's probably not going to be the best experience even if it does work. We'll be adding ARM builds in the future to have a good native experience on those devices. Either way, if you tell me which GPU/hardware you're running on, I can try to reproduce your issue.
Hi @tomas-pet, can you please share the hardware you are running on?
Here was my input command:
python model-qa.py -m Phi-3-mini-128k-instruct-onnx/directml/directml-int4-awq-block-128 -l 2048
Here is the error I am getting:
Input: hi
Output: Traceback (most recent call last):
  File "model-qa.py", line 82, in <module>
    main(args)
  File "model-qa.py", line 47, in main
    generator.compute_logits()
onnxruntime_genai.onnxruntime_genai.OrtException: Failed to parse the cuda graph annotation id: -1