Float8 with FSDP and DelayedScaling: 'WeightWithDelayedFloat8CastTensor' object has no attribute '_tensor'. #1605

Open
fmo-mt opened this issue Jan 23, 2025 · 3 comments

fmo-mt commented Jan 23, 2025

I tried:

weight_config = CastConfig(
    scaling_type=ScalingType.DELAYED,
)
config = Float8LinearConfig(
    enable_fsdp_float8_all_gather=enable_fsdp_float8_all_gather,
    force_recompute_fp8_weight_in_bwd=enable_fsdp_float8_all_gather,  # same as enable_fsdp_float8_all_gather
    cast_config_weight=weight_config,
)
convert_to_float8_training(model, config=config)

and then:

model = FSDP(model, **kwargs)

PyTorch throws: AttributeError: 'WeightWithDelayedFloat8CastTensor' object has no attribute '_tensor'. Did you mean: 'new_tensor'?

Contributor

vkuzo commented Jan 23, 2025

Hi @fmo-mt, float8 all-gather is supported in FSDP2 but not in FSDP1. Your code sample looks like it is using FSDP1. Have you tried FSDP2? Some information on FSDP2 is here: https://github.com/pytorch/torchtitan/blob/main/docs/fsdp.md
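
For readers landing here, a minimal sketch of what the FSDP2 route might look like. It assumes PyTorch 2.6 or newer (where `fully_shard` is importable from `torch.distributed.fsdp`) and a torchao release that provides `Float8LinearConfig` and `convert_to_float8_training`. The helper name `shard_with_float8_all_gather` and its `blocks` argument are hypothetical, and the weight cast uses torchao's default scaling rather than the `ScalingType.DELAYED` config from the original post. It has not been run against the reporter's model.

```python
def shard_with_float8_all_gather(model, blocks):
    """Convert `model` to float8 training, then shard it with FSDP2.

    `blocks` is an iterable of container submodules of `model` (for example
    its transformer layers); each one becomes its own FSDP2 group.
    Intended to run in a distributed job, e.g. one launched with torchrun.
    """
    # Imported lazily so this sketch can be loaded without torch/torchao present.
    # Assumption: PyTorch >= 2.6 exposes the FSDP2 entry point at this path.
    from torch.distributed.fsdp import fully_shard
    from torchao.float8 import Float8LinearConfig, convert_to_float8_training

    # Only the all-gather flag is set; everything else keeps torchao's defaults.
    config = Float8LinearConfig(enable_fsdp_float8_all_gather=True)

    # Swap nn.Linear modules for float8 versions *before* applying FSDP2.
    convert_to_float8_training(model, config=config)

    # FSDP2 is applied bottom-up: shard each block first, then the root module.
    for block in blocks:
        fully_shard(block)
    fully_shard(model)
    return model
```

A call might look like `shard_with_float8_all_gather(model, model.layers)`, where `model.layers` is a placeholder for whatever holds the repeated blocks in your model. The key difference from the snippet in the original post is that `FSDP(model, **kwargs)` (FSDP1) is replaced by `fully_shard`.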

vkuzo added the float8 label Jan 23, 2025

Author

fmo-mt commented Jan 24, 2025

Thanks @vkuzo, I am indeed using FSDP1, lol.
Btw, can I modify it a little so that I can use this feature with FSDP1?

Contributor

vkuzo commented Jan 24, 2025

Fun fact: one of the reasons FSDP2 was built was to support features such as float8 all-gather. These are fundamentally difficult to support in FSDP1 because it lacks per-parameter sharding granularity. So, no, there isn't an easy way to get this functionality on FSDP1.
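
To make the granularity point concrete, here is a toy, pure-Python sketch (no PyTorch involved; the parameter names and sizes are made up). It contrasts FSDP1-style flat-parameter sharding, where parameters are concatenated into one buffer that is split evenly across ranks, with FSDP2-style sharding, where each parameter is split on its own. Real tensors are multi-dimensional and FSDP2 shards along dim 0; this sketch treats every parameter as a 1-D range and ignores padding details, so take it as an illustration rather than a description of either implementation.

```python
def flat_shard(param_sizes, world_size):
    """FSDP1-style toy: concatenate all parameters, split the buffer evenly.

    Returns, for each rank, the (name, start, end) pieces of parameters it owns.
    """
    total = sum(param_sizes.values())
    shard_len = -(-total // world_size)  # ceil division; last shard may be short
    pieces = [[] for _ in range(world_size)]
    offset = 0  # position of the current parameter inside the flat buffer
    for name, size in param_sizes.items():
        start = 0
        while start < size:
            rank = (offset + start) // shard_len
            rank_end = (rank + 1) * shard_len  # flat index where this rank's shard ends
            end = min(size, rank_end - offset)
            pieces[rank].append((name, start, end))
            start = end
        offset += size
    return pieces


def per_param_shard(param_sizes, world_size):
    """FSDP2-style toy: split every parameter across ranks individually."""
    pieces = [[] for _ in range(world_size)]
    for name, size in param_sizes.items():
        chunk = -(-size // world_size)  # ceil division
        for rank in range(world_size):
            start = min(rank * chunk, size)
            end = min(start + chunk, size)
            pieces[rank].append((name, start, end))
    return pieces


sizes = {"w1": 4, "w2": 6, "w3": 2}  # hypothetical parameter lengths

print(flat_shard(sizes, 2))
# [[('w1', 0, 4), ('w2', 0, 2)], [('w2', 2, 6), ('w3', 0, 2)]]

print(per_param_shard(sizes, 2))
# [[('w1', 0, 2), ('w2', 0, 3), ('w3', 0, 1)], [('w1', 2, 4), ('w2', 3, 6), ('w3', 1, 2)]]
```

In the flat layout, rank 0 holds none of `w3` and rank 1 holds none of `w1`, and shard boundaries fall wherever the buffer happens to be cut. In the per-parameter layout every rank holds a slice of every parameter, which is the kind of granularity that per-weight handling such as a float8 cast before all-gather relies on.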
