Issues: triton-inference-server/server
Support histogram custom metric in Python backend [enhancement: New feature or request]
#7287 opened May 28, 2024 by ShuaiShao93
What is the correct way to run inference in parallel in Triton?
#7283 opened May 28, 2024 by sandesha-hegde
Confusion about prefetch [performance: A possible performance tune-up] [question: Further information is requested]
#7282 opened May 28, 2024 by SunnyGhj
Windows 10 docker build error: "Could not locate a complete Visual Studio instance" [investigating: The development team is investigating this issue]
#7281 opened May 28, 2024 by jinkilee
Automatically unload (oldest) models when memory is full [enhancement: New feature or request]
#7279 opened May 27, 2024 by elmuz
No 24.05-trtllm-python-py3 in NGC repo [question: Further information is requested]
#7277 opened May 25, 2024 by avianion
[Bug] Model 'ensemble' receives inputs originating from different decoupled models
#7275 opened May 25, 2024 by michaelnny
Triton BLS model with dynamic batching does not execute at the expected batch size [investigating: The development team is investigating this issue]
#7271 opened May 24, 2024 by njaramish
Tritonserver hangs on launch with Python backend [investigating: The development team is investigating this issue]
#7268 opened May 24, 2024 by JamesBowerXanda
Custom backend using recommended.cc not generating correct output [investigating: The development team is investigating this issue]
#7266 opened May 24, 2024 by jgrsdave
Pods receiving traffic too early when scaling with HPA causes "Socket closed" errors on Triton Inference Server [investigating: The development team is investigating this issue]
#7264 opened May 23, 2024 by patriksabol
Add server-side metrics for input and output sizes
#7263 opened May 23, 2024 by yongbinfeng
launch_triton_server.py attempts to place two models on the same GPU instead of one model on two GPUs
#7255 opened May 21, 2024 by ethan-digi