How to build TensorRT-LLM engine on host and deploy to Jetson Orin Nano Super? #3149

Open
@Sesameisgod

Description

Hi, I’m currently working with TensorRT-LLM and trying to deploy a model (e.g., Qwen2-VL-2B-Instruct) on a Jetson Orin Nano Super. However, due to limited memory on the Nano, I’m unable to build the TensorRT engine directly on the device.
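For context, this is roughly what I ultimately want to run on the Orin (a text-only sketch; it assumes `tensorrt_llm` is installed on the device and uses the `ModelRunner` helper, with placeholder paths and token IDs — the Qwen2-VL vision encoder would presumably need the separate multimodal example flow on top of this):

```python
# Text-only inference sketch for the Jetson side. Engine path, prompt
# token IDs, and end/pad IDs are placeholders, not verified values for
# Qwen2-VL; the vision encoder is out of scope here.
import torch
from tensorrt_llm.runtime import ModelRunner

runner = ModelRunner.from_dir(engine_dir="./qwen2-vl-2b-engine")  # placeholder path
batch_input_ids = [torch.tensor([1, 2, 3], dtype=torch.int32)]    # placeholder prompt tokens
output_ids = runner.generate(
    batch_input_ids=batch_input_ids,
    max_new_tokens=64,
    end_id=2,  # placeholder: use the real tokenizer end-of-sequence ID
    pad_id=0,  # placeholder: use the real tokenizer pad ID
)
print(output_ids)
```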

Is there any official or recommended approach to build the TensorRT-LLM engine on a more powerful host machine (with sufficient memory and GPU), and then transfer the generated engine file to the Jetson Orin Nano Super for inference?
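If that workflow is supported, I imagine the host-side build would look something like the sketch below (paths are illustrative, and I'm assuming the Hugging Face checkpoint is first converted with the `convert_checkpoint.py` example script matching the installed TensorRT-LLM version):

```python
# Host-side build sketch (x86 machine with sufficient memory/GPU).
# CKPT_DIR is assumed to hold the output of convert_checkpoint.py.
import subprocess

CKPT_DIR = "./qwen2-vl-2b-tllm-ckpt"  # illustrative path
ENGINE_DIR = "./qwen2-vl-2b-engine"   # directory I would copy to the Jetson

subprocess.run(
    [
        "trtllm-build",
        "--checkpoint_dir", CKPT_DIR,
        "--output_dir", ENGINE_DIR,
        "--gemm_plugin", "auto",
        "--max_batch_size", "1",
        "--max_input_len", "2048",
        "--max_seq_len", "4096",
    ],
    check=True,
)
```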

If so, are there any considerations or compatibility issues I should be aware of when cross-building the engine on x86 and deploying it on Jetson (aarch64)?
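My current understanding (please correct me if wrong) is that a plain TensorRT engine is only valid for the exact TensorRT version and GPU compute capability it was built with (Orin reports SM 8.7), so a sanity check like the following on both machines would presumably be step one:

```python
# Sanity check to run on both the x86 host and the Jetson: a TensorRT
# engine is tied to the TensorRT version and GPU compute capability it
# was built for, so these should match between build and deploy targets.
import tensorrt as trt
import torch

print("TensorRT version:  ", trt.__version__)
print("Compute capability:", torch.cuda.get_device_capability(0))
```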

Thanks in advance!

Labels: question (Further information is requested), triaged (Issue has been triaged by maintainers)
