Commit b8ff482

Dipet, tjruwase, and jeffra authored
Fix: Sparse tensors not updating (#1914)
* Fix do not updated sparse grads
* Remove call .data for sparse grads

Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>
1 parent 5208eb7 commit b8ff482

1 file changed: +2 -1 lines changed


deepspeed/runtime/engine.py

Lines changed: 2 additions & 1 deletion
@@ -2169,7 +2169,8 @@ def _get_gradients_for_reduction(self):
 
     grad_data = param.grad.data
     if param_name in self.sparse_tensor_module_names or grad_data.is_sparse:
-        grad_data = SparseTensor(grad_data)
+        # Call param.grad without data to avoid problem with setting of updated grads
+        grad_data = SparseTensor(param.grad)
 
     if is_moe_param(param):
         expert_grads[param.group_name].append(grad_data)
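
Note: the sketch below is a pure-Python model of the aliasing problem this change appears to address. It is not DeepSpeed or PyTorch code. `Handle`, `Wrapper`, `Param`, and `run` are hypothetical stand-ins, and the sketch rests on two assumptions: that reading `.data` yields a new object each time (as `Tensor.data` does), and that the reduced gradient is later written back through the reference the wrapper stored, which is what the comment about "setting of updated grads" suggests.

```python
class Handle:
    """Hypothetical stand-in for a tensor: a rebindable handle onto a list."""

    def __init__(self, values):
        self._values = values

    @property
    def data(self):
        # Modeled on Tensor.data: returns a NEW handle onto the same values.
        return Handle(self._values)

    @data.setter
    def data(self, other):
        # Modeled on assigning to Tensor.data: rebinds this handle only.
        self._values = other._values

    def tolist(self):
        return list(self._values)


class Wrapper:
    """Hypothetical stand-in for SparseTensor: keeps the handle it was given."""

    def __init__(self, handle):
        self.orig = handle

    def write_back(self, reduced):
        # Assumed write-back step after reduction.
        self.orig.data = reduced


class Param:
    """Hypothetical stand-in for a parameter that owns a gradient handle."""

    def __init__(self, grad):
        self.grad = grad


def run(use_data):
    param = Param(Handle([1.0, 1.0]))
    # Before the fix the wrapper received param.grad.data, after it param.grad.
    wrapped = Wrapper(param.grad.data if use_data else param.grad)
    wrapped.write_back(Handle([5.0, 5.0]))
    return param.grad.tolist()


before_fix = run(use_data=True)   # write-back hits a temporary handle
after_fix = run(use_data=False)   # write-back hits the real gradient
print(before_fix, after_fix)
```

In this model, wrapping `param.grad.data` leaves the parameter's gradient at its old values because the write-back rebinds a temporary object, while wrapping `param.grad` itself lets the update reach the gradient the optimizer would read.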

0 commit comments
