Unable to run FLAN-T5 inference on GCP TPU v3 (TF 2.16.1) #30901
Comments
Hey! Looking at the traceback and the code, it seems like you are using custom code +
cc @sayakpaul - do we have any examples for XLA generation on TPU? Also, one thing I'd point out is that in general,
No, I don't think so. Additionally, I concur with your suggestions here:
Understood! @sumanthratna, let me know if you can't get it working, and I'll try to reproduce the issue here and make a working example.
Thanks for the replies, @sayakpaul @Rocketknight1! I did try without it. It may also be relevant that I'm not able to run the above code using just XLA (without tf Strategy); I'll post a reproduction here shortly.
Hi @sumanthratna, yeah - our recommendation for TPU debugging is to first get the code working on CPU/GPU with XLA enabled. We have a guide specifically on XLA generation with TensorFlow here, which might help!
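As a rough illustration of what compiling with `jit_compile=True` involves (a toy TensorFlow function stands in for `model.generate` here; this is not the transformers API itself), assuming `tensorflow` is installed:

```python
import tensorflow as tf

# XLA wants static shapes: every call with a new input shape triggers a
# fresh (slow) compilation, which is why generation inputs are usually
# padded to a fixed length before being passed in.
@tf.function(jit_compile=True)
def xla_step(x):
    # Stand-in for a model forward pass; any traced TF computation
    # compiles the same way under XLA.
    return tf.reduce_sum(x * 2, axis=-1)

# Padded batch with a fixed shape -> compiled once, reused afterwards.
batch = tf.constant([[1, 2, 3, 0], [4, 5, 0, 0]])
print(xla_step(batch).numpy())  # [12 18]
```

The key design point is keeping input shapes constant across calls, so the XLA compilation cost is paid only on the first invocation.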
XLA (using CPU) on a CPU machine

See here for my successful go at running inference with XLA on CPU: https://colab.research.google.com/drive/1KOrB7DBm92isAfvsiQcaMUs1uZGCQyUp?usp=sharing

XLA (using CPU) on a TPU VM

When I run the above code from the notebook (without
Yeah - cc @gante, did we ever try XLA generation on TPU? Should we expect it to work at all, or would users need something more like a simplified manual generation loop?
@Rocketknight1 nope, at least I haven't :D
Hmn, okay - I'm afraid this is just really untested @sumanthratna! If you get it working, please let us know, and we can document it somewhere, but you might have to code some kind of manual generation loop for TPU instead. |
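The manual generation loop mentioned above can be sketched in plain Python. The `toy_logits` model below is hypothetical, standing in for what would be a jit-compiled forward pass on TPU:

```python
def argmax(scores):
    # Index of the largest score; ties resolve to the lowest index.
    return max(range(len(scores)), key=scores.__getitem__)

def greedy_generate(logits_fn, input_ids, max_new_tokens, eos_id):
    # Minimal greedy decoding loop: repeatedly pick the most likely next
    # token. On TPU, logits_fn would be a compiled forward pass called
    # with fixed-shape (padded) inputs so XLA compiles it only once.
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        next_id = argmax(logits_fn(ids))
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids

def toy_logits(ids, vocab_size=5):
    # Hypothetical stand-in model: always prefers token (last token + 1).
    scores = [0.0] * vocab_size
    scores[min(ids[-1] + 1, vocab_size - 1)] = 1.0
    return scores

print(greedy_generate(toy_logits, [0], max_new_tokens=10, eos_id=3))  # [0, 1, 2, 3]
```

A loop like this sidesteps `model.generate`'s internal control flow, which is the part most likely to be untested under TPU execution.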
System Info

transformers version: 4.41.0

Who can help?
@gante, @Rocketknight1, @ArthurZucker
Information

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
```shell
gcloud compute tpus tpu-vm create my-tpu-vm --zone=us-central1-a --accelerator-type=v3-8 --version=tpu-vm-tf-2.16.1-pjrt
gcloud compute tpus tpu-vm ssh --zone "us-central1-a" "my-tpu-vm" --project $GCP_PROJECT_NAME
python3 -m pip install transformers sentencepiece
python3 test-hf.py
```
traceback:
The same behavior occurs when using StreamExecutor rather than PJRT. The same behavior occurs when removing `jit_compile=True` from `tf.function(model.generate, jit_compile=True)`.

Expected behavior
I expect generation to succeed and yield the following final output: