
Package request: TensorRT #25661

Open
2 tasks done
hmaarrfk opened this issue Mar 7, 2024 · 4 comments

Comments

@hmaarrfk
Contributor

hmaarrfk commented Mar 7, 2024

Package name

tensorrt

Package version

Newest

Package website

https://github.com/NVIDIA/TensorRT
https://pypi.org/project/tensorrt/

Package availability

https://pypi.org/project/tensorrt/

Additional comments

It seems that in 2022 this was taken off the CUDA build list:
#21382

Any particular reason for that? Was the GitHub page for TensorRT missing at the time?

cc: @jakirkham

Package is not available

  • The package is not available on conda-forge.

No previous issues or open PRs

  • No previous issue exists and no PR has been opened.
@jakirkham
Member

Thanks for raising, Mark! 🙏


First, a small clarification on this point:

It seems that in 2022 this was taken off the cuda build list:
#21382

It's still on the list; it is unchecked, with a note that it is missing a redist.

[Screenshot: the CUDA build list entry for TensorRT, unchecked and annotated as missing a redist]

To provide more context: the CUDA packages are built in an internal pipeline that creates a binary redist at the end. We currently lack that process for TensorRT, so it will need to be built out first. This may take some time.

As to the GitHub repo, it provides only the open source portion, which includes things like examples, samples, and some open source code. It is still missing the closed source portions, which are not distributed there. Without those, it probably won't be sufficient.

AFAICT the PyPI artifact is just an sdist from the GitHub repo, so it similarly lacks the closed source portions. Presumably a wheel with the relevant binary components would be needed.
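For what it's worth, a quick way to check what PyPI actually ships for a release is its public JSON API; a minimal sketch:

```python
import json
import urllib.request

# Fetch release metadata for tensorrt from PyPI's JSON API and list
# the artifact types (sdist vs. bdist_wheel) of the latest release.
with urllib.request.urlopen("https://pypi.org/pypi/tensorrt/json") as resp:
    data = json.load(resp)

for artifact in data["urls"]:
    print(artifact["packagetype"], artifact["filename"])
```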


In any event, I have followed up internally and made sure we have an issue tracking this need. I will update the old CUDA listing issue to reference this one.

If you are able to share more about your use cases in this issue, that would be helpful. Are there particular packages that would benefit from building with TensorRT? What use cases do you or others have in mind where TensorRT-enabled builds would help?

@hmaarrfk
Contributor Author

Thanks. I saw the note and was confused about how the stance might have changed, but I wanted to revive the issue.

It would be great to have a TensorRT-enabled onnxruntime. I believe PyTorch and TensorFlow now have the ability to leverage TensorRT as well.

ONNX has proved to be amazing for ML deployment, since we can plug in TensorRT when the hardware is available.

We've shown that TensorRT would yield roughly a 30% boost for our inference models, but creating a dedicated model for TensorRT might be too much churn for our team. I would rather just have an ONNX model and let the client optimize as needed.
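As a sketch of what "let the client optimize as needed" looks like with onnxruntime's execution-provider API (the model path here is a placeholder):

```python
import onnxruntime as ort

# Ask for the TensorRT execution provider first, falling back to
# CUDA and then CPU; onnxruntime skips providers that are not
# available in the current build or on the current hardware.
providers = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

# "model.onnx" is a placeholder path for an exported ONNX model.
session = ort.InferenceSession("model.onnx", providers=providers)

# Report which providers were actually enabled for this session.
print(session.get_providers())
```

The same ONNX model runs everywhere and picks up TensorRT only where it is available.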

@jakirkham
Member

Thanks for doing that! 🙏

This also helps signal what is important. NVIDIA has a lot of software, so having input from users about what is needed is helpful.

Appreciate the insight. So is it this issue ( conda-forge/onnxruntime-feedstock#109 )? Or is there a different issue that is relevant?

@hmaarrfk
Contributor Author

Appreciate the insight. So is it this issue ( conda-forge/onnxruntime-feedstock#109 )? Or is there a different issue that is relevant?

Yes, it is relevant. Truthfully, I opened both issues since it seems that TensorRT is both a C++ library (potentially an operating-system-level package?) and a Python library.
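For reference, the Python package is a binding over the C++ library, so both surfaces ship together. A minimal sketch, assuming the tensorrt Python package is installed, just to show the entry point:

```python
import tensorrt as trt

# The Python module is a binding over the C++ nvinfer library;
# constructing a Builder exercises both layers.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

print(trt.__version__)
```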

NVIDIA has a lot of software. So having input from users about what is needed is helpful

I totally understand that. Your efforts here are definitely appreciated by me and the rest of my team!

From a system standpoint and for "edge deployment", the more performance we can squeeze out of our GPUs, the more likely we are to use them. The engineers on our team have shown that in many cases running the models on CPUs gives "good enough" performance. A 30% boost from TensorRT means we can consider running the models for "real time" use cases, but, you know, only when we buy the expensive GPUs ;).
