Feat/faster rms norm #182
I have made gradients w.r.t. weights work properly, except for a single test, which gets a random number of elements wrong with no apparent cause (on every call the number and indices of the mismatched elements differ). However, it's always the same test ( |
@S1ro1 a fix was made to rmsnorm on the main branch; can you check whether the same error persists after a rebase? ty |
@lancerts the error stays the same: just a few elements mismatched, always in one and the same test. To be fair, I have no clue how to fix that, so I guess you can unassign me. I've been working on this on and off for 3-4 days but haven't made a single step forward since the bug occurred. I guess I bit off more than I could chew. The branch should have some base work done on the issue, but starting from scratch might be a better option. |
#255 resolved by this PR |
Summary
Implements partial aggregation in `rms_norm`, similar to that in `layer_norm`, as described in #179.

Testing Done
- `make test` to ensure correctness
- `make checkstyle` to ensure code style
- `make test-convergence` to ensure convergence
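
For readers unfamiliar with the partial aggregation mentioned in the summary, here is a minimal pure-Python sketch of the idea applied to the RMSNorm weight gradient. It is not the kernel from this PR. The function names, the `num_programs` parameter, and the `eps` default are all hypothetical, and it assumes the common formulation `y = w * x / rms(x)`, so that `dW[j]` is the sum over rows of `dy[i][j] * x[i][j] / rms(x[i])`. Each simulated program accumulates a partial `dW` over its own block of rows, and the partials are summed in a final reduction.

```python
import math
import random


def inv_rms(row, eps):
    """1 / sqrt(mean(row^2) + eps) for a single row."""
    return 1.0 / math.sqrt(sum(v * v for v in row) / len(row) + eps)


def weight_grad_direct(x, dy, eps=1e-6):
    """Reference: accumulate dW over all rows in one buffer."""
    n_cols = len(x[0])
    dw = [0.0] * n_cols
    for row, grad in zip(x, dy):
        r = inv_rms(row, eps)
        for j in range(n_cols):
            dw[j] += grad[j] * row[j] * r
    return dw


def weight_grad_partial(x, dy, num_programs=4, eps=1e-6):
    """Same gradient via per-program partial sums.

    num_programs is a hypothetical stand-in for the number of parallel
    kernel instances; each one owns a contiguous block of rows.
    """
    n_rows, n_cols = len(x), len(x[0])
    rows_per_program = math.ceil(n_rows / num_programs)
    # One partial dW buffer per program, so no two programs share a buffer.
    partials = [[0.0] * n_cols for _ in range(num_programs)]
    for pid in range(num_programs):
        start = pid * rows_per_program
        stop = min(start + rows_per_program, n_rows)
        for i in range(start, stop):  # empty if this program has no rows
            r = inv_rms(x[i], eps)
            for j in range(n_cols):
                partials[pid][j] += dy[i][j] * x[i][j] * r
    # Final reduction over the (small) number of partial buffers.
    return [sum(p[j] for p in partials) for j in range(n_cols)]


if __name__ == "__main__":
    random.seed(0)
    x = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(10)]
    dy = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(10)]
    direct = weight_grad_direct(x, dy)
    partial = weight_grad_partial(x, dy, num_programs=4)
    print(all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
              for a, b in zip(direct, partial)))
```

The two functions add the same terms in a different order, so their results should be compared with a tolerance rather than for exact equality.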