Bug Description
CPU offloading is enabled by default in MTTM (MutableTorchTensorRTModule), which causes device-mismatch issues for embedding layers.
For example, here is the code for the VLM component of the GR00T model: https://github.com/NVIDIA/Isaac-GR00T/blob/main/gr00t/model/backbone/eagle2_hg_model/modeling_eagle2_5_vl.py#L235
Once the language model is compiled with MTTM, it is moved to the CPU, so this operation fails: the input_ids tensor is on the GPU while the embedding layer (self.embed_tokens) is on the CPU.
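A minimal, self-contained sketch of the failure mode, independent of Torch-TensorRT (the module, shapes, and tensor names are made up; only the CPU/GPU placement mirrors the scenario above):

import torch
import torch.nn as nn

# Hypothetical stand-in for the language model's embedding layer after the
# original module has been offloaded: its weights live on the CPU.
embed_tokens = nn.Embedding(num_embeddings=32000, embedding_dim=64)  # stays on CPU

# Inputs produced by the rest of the GPU-resident pipeline.
input_ids = torch.randint(0, 32000, (1, 16), device="cuda")

# Raises a device-mismatch RuntimeError because the embedding weight is on the
# CPU while input_ids is on cuda:0.
hidden_states = embed_tokens(input_ids)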
offload_module_to_cpu isn't supported in MTTM, so adding that support will fix this issue. The following works:
if self.additional_settings.get("offload_module_to_cpu", False):
    deallocate_module(self.original_model, delete_module=False)
But deallocate_module is used in multiple places, and those call sites need to be investigated.
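For reference, a sketch of how a caller might opt out of CPU offloading once the setting is plumbed through MTTM. The offload_module_to_cpu keyword and the way it is forwarded into additional_settings are assumptions based on the snippet above, not the current API:

import torch
import torch.nn as nn
import torch_tensorrt

# Hypothetical stand-in for the language model that would be compiled with MTTM.
language_model = nn.Sequential(nn.Linear(64, 64), nn.ReLU()).cuda()

# Assumption: extra keyword arguments such as offload_module_to_cpu would be
# forwarded into MTTM's additional_settings once support for the flag is added.
trt_language_model = torch_tensorrt.MutableTorchTensorRTModule(
    language_model,
    enabled_precisions={torch.float16},
    offload_module_to_cpu=False,  # hypothetical flag: keep the original module on the GPU
)

out = trt_language_model(torch.randn(8, 64, device="cuda"))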
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- Torch-TensorRT Version (e.g. 1.0.0):
- PyTorch Version (e.g. 1.0):
- CPU Architecture:
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, libtorch, source):
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version:
- CUDA version:
- GPU models and configuration:
- Any other relevant information: