Does TensorRT-LLM support multinode quantization when checkpoint shards are stored on different machines? For example, using 2 nodes with 8 GPUs each to quantize a model with TP size 16. Launching across the 2 nodes via SSH and OpenMPI, I'm encountering the error below.
Error:
NotImplementedError: ModelOpt does not support Dtensor sharded models to quantization/restore entry point yet. Please shard the model after quantization/restore
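For reference, the launch looked roughly like the following. This is a sketch, not the exact command: the hostnames, model paths, and quantization format are placeholders, and the script path assumes TensorRT-LLM's quantization example (`examples/quantization/quantize.py`).

```shell
# Sketch of a 2-node, 16-GPU launch with OpenMPI.
# Assumes passwordless SSH between node1 and node2 and an identical
# TensorRT-LLM environment on both machines. Hostnames and paths are placeholders.
mpirun -np 16 -H node1:8,node2:8 \
    python examples/quantization/quantize.py \
        --model_dir /path/to/model \
        --qformat fp8 \
        --tp_size 16 \
        --output_dir /path/to/quantized_ckpt
```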