Fun fact: one of the reasons FSDP2 was built was to support features such as float8 all-gather. These are fundamentally difficult to support in FSDP1 because it lacks per-parameter sharding granularity. So no, there isn't an easy way to get this functionality in FSDP1.
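To illustrate the granularity point, here is a toy sketch in plain Python (no PyTorch involved). It assumes the usual description of the two designs: FSDP1 flattens a group of parameters into one buffer and splits that buffer evenly, while FSDP2 shards each parameter on its own. The helper names `flat_shard` and `per_param_shard` and the parameter names `w1` and `w2` are made up for this example and are not part of any library.

```python
# Toy illustration only: plain Python lists stand in for tensors.

def flat_shard(params, world_size):
    """FSDP1-style: concatenate all parameters, then split the flat buffer evenly."""
    # Track (parameter name, element index) so we can see what lands on each rank.
    flat = [(name, i) for name, values in params.items() for i in range(len(values))]
    per_rank = -(-len(flat) // world_size)  # ceiling division
    return [flat[r * per_rank:(r + 1) * per_rank] for r in range(world_size)]


def per_param_shard(params, world_size):
    """FSDP2-style: split each parameter separately across ranks."""
    shards = [dict() for _ in range(world_size)]
    for name, values in params.items():
        per_rank = -(-len(values) // world_size)  # ceiling division
        for r in range(world_size):
            shards[r][name] = values[r * per_rank:(r + 1) * per_rank]
    return shards


# Hypothetical model with two parameters of different sizes.
params = {"w1": [0.1] * 6, "w2": [0.2] * 2}

flat = flat_shard(params, world_size=2)
per_param = per_param_shard(params, world_size=2)

# With flat sharding, one rank's shard can mix elements of several parameters.
print({name for name, _ in flat[1]})

# With per-parameter sharding, every rank holds a clean slice of each parameter.
print({name: len(chunk) for name, chunk in per_param[0].items()})
```

In the flat layout a rank's shard can straddle parameter boundaries, so anything that needs to treat each weight individually before communication (such as casting it to float8 with its own scale) has no clean per-parameter tensor to work with. In the per-parameter layout each rank's slice of a weight is its own object, which is what makes a per-parameter hook for float8 all-gather practical.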
I tried:
and then:
PyTorch throws:
AttributeError: 'WeightWithDelayedFloat8CastTensor' object has no attribute '_tensor'. Did you mean: 'new_tensor'?