
🐛 [Bug] Reduce perf gap on RAFT #3731

@zewenli98

Bug Description

A user reported that Torch-TensorRT has slower inference on RAFT than the original PyTorch model:

Backend           Time (ms)   Speedup
-------------------------------------
Original             7.12        -
Torch-TensorRT      20.35      0.35x
ONNX-TRT             2.96      2.41x
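
For context, a minimal sketch of how such a comparison might be reproduced. It assumes torchvision's raft_large model, the dynamo frontend, FP32 precision, and illustrative input shapes and iteration counts; none of these are confirmed by the report.

```python
import torch
import torch_tensorrt
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

# Hypothetical input size (divisible by 8, as RAFT requires); the user's
# actual shapes were not given in the report.
img1 = torch.randn(1, 3, 520, 960, device="cuda")
img2 = torch.randn(1, 3, 520, 960, device="cuda")

model = raft_large(weights=Raft_Large_Weights.DEFAULT).eval().cuda()

# Compile with the dynamo frontend at FP32 (assumed precision).
trt_model = torch_tensorrt.compile(
    model,
    ir="dynamo",
    inputs=[img1, img2],
    enabled_precisions={torch.float32},
)

def time_ms(fn, iters=50, warmup=10):
    # CUDA-event timing: warm up first, then average over iters runs.
    for _ in range(warmup):
        fn()
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

with torch.inference_mode():
    print(f"Original:       {time_ms(lambda: model(img1, img2)):.2f} ms")
    print(f"Torch-TensorRT: {time_ms(lambda: trt_model(img1, img2)):.2f} ms")
```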

I printed out the engine profiles for both Torch-TensorRT and ONNX-TRT; the Torch-TRT engine contains roughly twice as many layers as the ONNX-TRT engine.
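
For reference, one way to count layers and dump per-layer information from a serialized TensorRT engine is TensorRT's EngineInspector; the engine path below is a placeholder, not a file from the report.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# "raft.engine" is a placeholder for a serialized engine built by either path.
with open("raft.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

print(f"num_layers: {engine.num_layers}")

# EngineInspector is available in TensorRT >= 8.2; full per-layer detail
# requires the engine to be built with ProfilingVerbosity.DETAILED.
inspector = engine.create_engine_inspector()
print(inspector.get_engine_information(trt.LayerInformationFormat.JSON))
```

For the ONNX-TRT path, something like `trtexec --onnx=raft.onnx --dumpLayerInfo --profilingVerbosity=detailed` should produce comparable per-layer output.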
