cpu memory optimization rebased to main #3868
base: main
Conversation
Force-pushed from b9b6aeb to 51f64f0
Make sure to add a link to the resource_management page in index.rst
> This shifts one model copy from GPU to CPU memory. As a result, peak GPU memory usage decreases to about **1×** the model size, while CPU memory usage increases by roughly **1×**.
This is a bit confusing; can we say it increases to roughly **2×** model size?
We cannot say that, because it depends on which other CPU memory optimization options users choose. Can we say "one more copy of the model will occupy CPU memory" to make it clearer?
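For context, a minimal usage sketch of the behavior the docs describe. The flag name `offload_module_to_cpu` is taken from this PR's discussion and is an assumption here; the exact kwarg in the merged API may differ.

```python
# Minimal sketch, assuming `offload_module_to_cpu` is the compile-time
# flag this PR introduces; the exact name/placement may differ.
import torch
import torch_tensorrt

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1),
    torch.nn.ReLU(),
).eval().cuda()

# With offloading enabled, one copy of the module is moved to host
# memory during compilation: peak GPU usage stays near 1x the model
# size, while CPU memory holds one extra copy of the weights.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch.randn(1, 3, 224, 224, device="cuda")],
    offload_module_to_cpu=True,  # assumed flag name from this PR
)
```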
@needs_refit  # type: ignore[misc]
-def _insert_engine_to_cache(self, hash_val: str, serialized_engine: bytes) -> None:
+def _insert_engine_to_cache(self, hash_val: str, engine: trt.ICudaEngine) -> None:
@zewenli98 when do these calls run? Will this conflict with the goal of keeping memory usage under 3×?
Should we do caching in a post-processing step?
For example, we could return the cache entry as one of the InterpreterResult fields.
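To make the suggestion concrete, a rough sketch of the "cache entry as a result field" shape. The dataclass and field names below are stand-ins for illustration, not the actual torch_tensorrt types:

```python
# Sketch of the post-processing idea: the interpreter hands back what
# should be cached instead of inserting into the cache itself.
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchInterpreterResult:
    serialized_engine: bytes
    # Hypothetical extra field carrying (hash_val, blob) to cache later.
    cache_entry: Optional[tuple[str, bytes]] = None


def post_process(result: SketchInterpreterResult, cache: dict[str, bytes]) -> None:
    # Cache insertion happens after interpretation has finished,
    # outside the interpreter, so it cannot add to build-time peak memory.
    if result.cache_entry is not None:
        hash_val, blob = result.cache_entry
        cache[hash_val] = blob


result = SketchInterpreterResult(
    serialized_engine=b"...", cache_entry=("abc123", b"...")
)
cache: dict[str, bytes] = {}
post_process(result, cache)
```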
        
          
py/torch_tensorrt/dynamo/partitioning/_adjacency_partitioner.py (outdated; resolved)
        
settings: CompilationSettings = CompilationSettings(),
arg_inputs: Optional[Sequence[Input]] = None,
kwarg_inputs: Optional[dict[str, Any]] = None,
engine_cache: Optional[BaseEngineCache] = None,
We need to move the cache insert here
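One possible shape for moving the insert, sketched with toy stand-ins; none of the names below are the real torch_tensorrt internals:

```python
# Illustrative sketch of "move the cache insert here": the conversion
# entry point, not the interpreter, writes to the engine cache.
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeInterpreterResult:
    serialized_engine: bytes


class DictEngineCache:
    """Toy cache keyed by graph hash; real code would use BaseEngineCache."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def insert(self, hash_val: str, blob: bytes) -> None:
        self._store[hash_val] = blob


def convert_module(
    hash_val: str,
    engine_cache: Optional[DictEngineCache] = None,
) -> FakeInterpreterResult:
    # Stand-in for interpretation producing a serialized engine.
    result = FakeInterpreterResult(serialized_engine=b"engine-bytes")
    # The insert now lives here, after interpretation has finished and
    # the TRT builder's workspace has been released.
    if engine_cache is not None:
        engine_cache.insert(hash_val, result.serialized_engine)
    return result


cache = DictEngineCache()
convert_module("abc123", engine_cache=cache)
```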
Force-pushed from 51f64f0 to f77df5a
    
Description
Fixes # (issue)
Type of change
Checklist: