Issues: huggingface/text-generation-inference
Unable to run TGI following the instructions on the readme
#2058 opened Jun 12, 2024 by xianbaoqian
1 of 4 tasks
idefics2: Sizes of tensors must match except in dimension 0. Expected size 448 but got size 447 for tensor number 2 in the list.
#2056 opened Jun 12, 2024 by pseudotensor
1 of 4 tasks
ROCm: Server error: transport error when running batch size >=2 (Falcon-11B)
#2043 opened Jun 8, 2024 by almersawi
2 of 4 tasks
GPU memory not saturated using microsoft/Phi-3-small-128k-instruct
#2040 opened Jun 7, 2024 by calwoo
1 of 4 tasks
RuntimeError: FlashAttention only supports Ampere GPUs or newer.
#2037 opened Jun 7, 2024 by Ansh-Sarkar
3 of 4 tasks
HuggingFaceM4/idefics2: TGI would crash when I set do_image_splitting to False
#2029 opened Jun 6, 2024 by newsbreakDuadua9
2 of 4 tasks
4bit quantized model using bnb not able to inference
#2025 opened Jun 5, 2024 by arihant-neohuman
2 of 4 tasks
Problem of inference with Mixtral-8x7B RuntimeError: ptxas failed with error code 2
#2009 opened Jun 4, 2024 by EvanDufraisse
2 of 4 tasks
stop param doesn't work at all for /v1/completions endpoint
#1999 opened Jun 3, 2024 by josephrocca
2 of 4 tasks
Unable to load quantized commandrplus-medusa on H100
#1991 opened Jun 1, 2024 by sdadas
2 of 4 tasks