Is there any updated TensorRT-LLM support for the NVIDIA AGX Orin?

It does not seem like TensorRT-LLM v0.12.0 supports Llama 3 models (I tried building engines for Llama 3.2-1B). Is it possible to build TensorRT-LLM from source to get that support? If so, has anyone successfully done so? If there is an updated TensorRT-LLM built specifically for the AGX Orin that supports newer models, I would love to know.
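For reference, the engine-build flow I was attempting follows the LLaMA example in the TensorRT-LLM repo; the paths below are placeholders for my local checkpoint and output directories, and the exact flags may differ between releases:

# Convert the Hugging Face checkpoint to TensorRT-LLM format (local paths are placeholders)
python3 examples/llama/convert_checkpoint.py \
    --model_dir ./Llama-3.2-1B \
    --output_dir ./llama-3.2-1b-ckpt \
    --dtype float16

# Build the engine from the converted checkpoint
trtllm-build \
    --checkpoint_dir ./llama-3.2-1b-ckpt \
    --output_dir ./llama-3.2-1b-engine \
    --gemm_plugin float16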

I think the answer, for now, is no, for the following reasons.

TensorRT-LLM/v0.12.0-jetson is the only version of trtllm that built the kernels for the jetson/tegra/aarch64 architecture.

TensorRT-LLM/v0.12.0-jetson support-matrix.md lists LLaMA/LLaMA 2/LLaMA 3/LLaMA 3.1


I attempted to compile TensorRT-LLM from the 0.21.0rc? branch.

It fails in several places on the Jetson AGX Orin devkit; the trtllm kernels won't build on Orin.

trtllm is built against TensorRT 10.11.0.33, while the AGX Orin ships with TensorRT 10.7.0.23-1.
The conan.io step requires the tensorrt Python package version 10.10.0.31 to be installed so that it can then install version 10.11.0.33.

Collecting tensorrt_cu12==10.10.0.31 (from tensorrt~=10.10.0->-r /home/scott/.git/TensorRT-LLM/requirements.txt (line 23))
  Using cached tensorrt_cu12-10.10.0.31.tar.gz (18 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 35, in <module>
        File "/tmp/pip-install-2bxsrteq/tensorrt-cu12_4ceb32d6a92145169c08a0fa6ccdfa28/setup.py", line 71, in <module>
          raise RuntimeError("TensorRT does not currently build wheels for Tegra systems")
      RuntimeError: TensorRT does not currently build wheels for Tegra systems
      [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
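You can confirm the version mismatch on the device itself; roughly like this (run the grep from inside the TensorRT-LLM checkout, and note the Python binding may not be installed on every JetPack image):

# TensorRT packages installed by JetPack
dpkg -l | grep -i tensorrt

# Version of the TensorRT Python binding, if present
python3 -c "import tensorrt; print(tensorrt.__version__)"

# Version TensorRT-LLM expects, from its requirements file
grep -i tensorrt requirements.txt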

I used to be able to compile versions earlier than v0.19.0 (or maybe v0.18.0), but they never worked because the kernels were not built.

Around v0.19.0, trtllm changed its compilation method to use the conan.io package manager running within its own Python .venv.
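For reference, the native (non-Docker) build path I was attempting goes through scripts/build_wheel.py; this is only a sketch, and the accepted flags have changed between releases:

# From the TensorRT-LLM checkout; 87-real targets the Orin GPU (sm_87)
python3 ./scripts/build_wheel.py --clean --cuda_architectures "87-real"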

I tried to build TensorRT-LLM from source like so:

git clone https://github.com/NVIDIA/TensorRT-LLM.git
cd TensorRT-LLM
git submodule update --init --recursive
git lfs pull

make -C docker release_build CUDA_ARCHS="'86-real;87-real'"

I followed the official tutorial. I had to edit the Dockerfile.multi file, but after that it seemed to build correctly. However, I can't build any engines without running into an "Aborted (core dumped)" error. How were you even able to build anything above the official v0.12.0 version? Would you happen to still have those Docker images? If so, would they support Llama 3.2?

If not, I don't know how NVIDIA expects developers to do anything serious on their edge devices if we have to wait for them to provide the support.

TensorRT-LLM/v0.12.0-jetson is the only version of trtllm that built the kernels for the jetson/tegra/aarch64 architecture.

Every other release only builds the cpp *.cu kernel binaries for x86_64.
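One way to see this is to check the architecture of the compiled libraries that end up inside the wheel; something like the following, assuming the libraries sit under tensorrt_llm/libs/ in the wheel (the exact layout and wheel location may differ between releases):

# Unpack the built wheel (wherever build_wheel.py reported it) and inspect the compiled libraries
unzip -o tensorrt_llm-*.whl -d /tmp/trtllm_wheel
# A build that actually targets Jetson reports "ARM aarch64" here, not "x86-64"
file /tmp/trtllm_wheel/tensorrt_llm/libs/*.so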

Up until TensorRT-LLM v0.15 I was able to build its wheel, but it was never functional because it did not build the kernels. I tried many different methods and looked at the trtllm code in pull requests where individuals tried to build for aarch64/Jetson. The one person who succeeded is an NVIDIA employee who was building for another purpose, and that is around when dusty-nv got trtllm v0.12.0-jetson to build so we would have something that worked.

At some point there existed somewhere at NVIDIA the CMakeLists.txt files that would build the kernels for aarch64/Jetson, but that must be proprietary, non-open-source code, because the files have never been published anywhere I've been able to find (github.com, the nvidia GitLab, nv-tegra.nvidia.com).
