Improve the efficiency of the RMSNorm aggregation #179
Comments
I would like to try this one. #take @lancerts. Edit: not sure how well I can do this, but I would like to try, so I'm not sure it's a good idea to assign it to me.
You can refer to the LayerNorm implementation in the Triton tutorial.
The Triton tutorial at https://triton-lang.org/main/getting-started/tutorials/05-layer-norm.html#sphx-glr-getting-started-tutorials-05-layer-norm-py uses atomic operations. We can instead use a method similar to https://github.com/linkedin/Liger-Kernel/blob/main/src/liger_kernel/ops/layer_norm.py#L106, which does not use atomic operations (sketches of both patterns appear below).
Yes, I went with an approach similar to the Liger layer_norm implementation; however, I'm having some issues with numerical stability. Is it OK to increase the absolute tolerance in the tests? I'll create a draft PR as soon as possible; I need to fix some issues first. @lancerts
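For context on the contrast drawn in the comments above, here is a minimal, hypothetical sketch of the tutorial-style aggregation, where every Triton program atomically adds its per-row contribution into one shared dW buffer. The kernel and helper names (`_dw_atomic_kernel`, `dw_via_atomics`) and the `dy * x_hat` contribution are illustrative assumptions, not the actual Liger or tutorial code.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def _dw_atomic_kernel(dy_ptr, x_hat_ptr, dw_ptr, n_cols, stride, BLOCK_SIZE: tl.constexpr):
    # One program per row: each program computes its row's contribution to dW
    # (assumed here to be dy * x_hat) and atomically adds it into a single
    # shared (n_cols,) buffer. The atomics serialize when many rows update the
    # same columns, which is the overhead this issue wants to avoid.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    dy = tl.load(dy_ptr + row * stride + cols, mask=mask, other=0.0).to(tl.float32)
    x_hat = tl.load(x_hat_ptr + row * stride + cols, mask=mask, other=0.0).to(tl.float32)
    tl.atomic_add(dw_ptr + cols, dy * x_hat, mask=mask)


def dw_via_atomics(dy: torch.Tensor, x_hat: torch.Tensor) -> torch.Tensor:
    # Hypothetical launch: one program per row, all accumulating into dw.
    n_rows, n_cols = dy.shape
    dw = torch.zeros(n_cols, device=dy.device, dtype=torch.float32)
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    _dw_atomic_kernel[(n_rows,)](dy, x_hat, dw, n_cols, dy.stride(0), BLOCK_SIZE=BLOCK_SIZE)
    return dw
```

The non-atomic alternative referenced in the comments is sketched after the feature pitch below.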
🚀 The feature, motivation and pitch
Modify this line, https://github.com/linkedin/Liger-Kernel/blob/main/src/liger_kernel/ops/rms_norm.py#L306, replacing the PyTorch sum with a partial aggregation in Triton. For reference, see
https://github.com/linkedin/Liger-Kernel/blob/main/src/liger_kernel/ops/layer_norm.py#L106,
which does two levels of aggregation, the first in Triton and the second in torch (more efficient).
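A minimal sketch of the proposed two-level aggregation, assuming the backward kernel is given a per-program partial buffer: level one happens in Triton, where each program reduces a strip of rows into its own row of a `(num_programs, n_cols)` buffer without atomics; level two is a single small `sum(dim=0)` in PyTorch, standing in for the full per-row sum currently done at rms_norm.py#L306. All names, the `dy * x_hat` contribution, and the tiling parameters are illustrative assumptions, not the actual Liger kernels.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def _dw_partial_kernel(
    dy_ptr, x_hat_ptr, dw_partial_ptr,
    n_rows, n_cols, stride,
    ROWS_PER_PROGRAM: tl.constexpr, BLOCK_SIZE: tl.constexpr,
):
    # Level 1 (Triton): each program reduces ROWS_PER_PROGRAM rows into a
    # private float32 accumulator, then stores it to its own row of
    # dw_partial. No atomics are needed because no two programs ever write
    # the same row of the partial buffer.
    pid = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    col_mask = cols < n_cols
    acc = tl.zeros((BLOCK_SIZE,), dtype=tl.float32)
    row_start = pid * ROWS_PER_PROGRAM
    for i in range(ROWS_PER_PROGRAM):
        row = row_start + i
        mask = col_mask & (row < n_rows)
        dy = tl.load(dy_ptr + row * stride + cols, mask=mask, other=0.0).to(tl.float32)
        x_hat = tl.load(x_hat_ptr + row * stride + cols, mask=mask, other=0.0).to(tl.float32)
        acc += dy * x_hat
    tl.store(dw_partial_ptr + pid * n_cols + cols, acc, mask=col_mask)


def dw_two_level(dy: torch.Tensor, x_hat: torch.Tensor, rows_per_program: int = 16) -> torch.Tensor:
    n_rows, n_cols = dy.shape
    num_programs = triton.cdiv(n_rows, rows_per_program)
    dw_partial = torch.empty((num_programs, n_cols), device=dy.device, dtype=torch.float32)
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    _dw_partial_kernel[(num_programs,)](
        dy, x_hat, dw_partial, n_rows, n_cols, dy.stride(0),
        ROWS_PER_PROGRAM=rows_per_program, BLOCK_SIZE=BLOCK_SIZE,
    )
    # Level 2 (PyTorch): reduce only num_programs rows instead of n_rows rows.
    return dw_partial.sum(dim=0)
```

Compared with the current code, the host-side reduction shrinks from `n_rows` rows to `num_programs` rows, which is where the efficiency gain described above would come from.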
Alternatives
No response
Additional context
No response