Explicitly set reentrant to `False` for torch checkpointing #528

an1lam · 2025-03-31T18:10:43Z

Description

The purpose of this PR is to set up OpenFold to be compatible with torch 2.6 (technically >=2.4), in particular for using torch.compile on modules that do activation checkpointing.

As discussed [here](https://pytorch.org/docs/2.6/checkpoint.html#torch.utils.checkpoint.checkpoint), torch 2.4 and newer require explicitly passing use_reentrant to the checkpointing function. Prior to torch 2.4 (e.g. 2.2), use_reentrant defaulted to True; however, we have found that non-reentrant checkpointing works better with DDP and torch.compile. Documentation for this is surprisingly hard to find, but it seems to match the anecdotal experience of others (ex). As a result, this change forces use_reentrant=False, enabling the use of torch.compile with our structure models.
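For reference, here is a minimal sketch of what passing use_reentrant explicitly looks like with plain `torch.utils.checkpoint.checkpoint`. The `block` function and tensor shapes are made up for illustration and are not OpenFold code; the sketch only checks that the non-reentrant path produces the same gradients as an uncheckpointed run.

```python
import torch
from torch.utils.checkpoint import checkpoint


def block(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a module whose activations we want to recompute.
    return torch.relu(x @ weight)


torch.manual_seed(0)
x = torch.randn(4, 8, requires_grad=True)
weight = torch.randn(8, 8, requires_grad=True)

# Non-reentrant checkpointing, requested explicitly via the keyword argument.
out = checkpoint(block, x, weight, use_reentrant=False)
out.sum().backward()
grad_ckpt = weight.grad.clone()

# Reference run without checkpointing, for comparison.
weight.grad = None
x.grad = None
block(x, weight).sum().backward()
grad_ref = weight.grad.clone()
```

Both runs should yield the same gradient for `weight`; checkpointing trades recomputation for memory and is not expected to change the numerical result.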

For Discussion

If maintainers are concerned about backwards compatibility and strongly prefer it, we could adapt this PR so that checkpoint_blocks takes use_reentrant as a user-specified kwarg that defaults to True and is then passed to get_checkpoint_fn. Let me know!
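A rough sketch of that alternative is below. It is a simplified illustration, not the actual OpenFold implementation: the names checkpoint_blocks, get_checkpoint_fn, and use_reentrant come from the description above, while the `blocks`, `args`, and `blocks_per_ckpt` parameters and the chunking logic are assumptions made so the example is self-contained.

```python
from functools import partial
from typing import Callable, Optional, Sequence, Tuple

import torch
from torch.utils.checkpoint import checkpoint


def get_checkpoint_fn(use_reentrant: bool = True) -> Callable:
    # Bind the flag once so every call site passes it explicitly.
    return partial(checkpoint, use_reentrant=use_reentrant)


def checkpoint_blocks(
    blocks: Sequence[Callable],
    args: Tuple[torch.Tensor, ...],
    blocks_per_ckpt: Optional[int],
    use_reentrant: bool = True,  # assumed default, kept for backwards compatibility
) -> Tuple[torch.Tensor, ...]:
    def run_chunk(start: int, end: int) -> Callable:
        # Run blocks[start:end] in sequence, always carrying a tuple of tensors.
        def fn(*a):
            for b in blocks[start:end]:
                a = b(*a)
                if not isinstance(a, tuple):
                    a = (a,)
            return a
        return fn

    # Skip checkpointing when it is disabled or when no graph is being built.
    if blocks_per_ckpt is None or not torch.is_grad_enabled():
        return run_chunk(0, len(blocks))(*args)

    ckpt = get_checkpoint_fn(use_reentrant)
    for start in range(0, len(blocks), blocks_per_ckpt):
        args = ckpt(run_chunk(start, start + blocks_per_ckpt), *args)
        if not isinstance(args, tuple):
            args = (args,)
    return args
```

With this shape, existing callers would keep the old behavior by default, and callers who want to use torch.compile could opt in with `checkpoint_blocks(blocks, args, blocks_per_ckpt, use_reentrant=False)`.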

* Explicitly set reentrant to False for torch checkpointing

  As discussed [here](https://pytorch.org/docs/2.6/checkpoint.html#torch.utils.checkpoint.checkpoint), torch 2.4 and newer require explicitly passing use_reentrant to the checkpointing function. Prior to torch 2.4 (e.g. 2.2), use_reentrant defaulted to True; however, we have found that reentrant checkpointing does not work with DDP and torch.compile [1]. As a result, this change forces use_reentrant=False to enable us to use torch.compile with our structure models.
* Update version
* Update checkpointing.py
* Update setup.py
* Update setup.py

an1lam · 2025-03-31T18:32:06Z

@jnwei apologies for the ping out of nowhere, but are you the right person to be requesting review from? I see you're the most recent person to merge a PR.
