ReMoRa + DoRa improves on ReMoRa #10

Thank you for sharing your results. In return, I will share my own:

If you reformulate the code so that the forward pass adds the decompressed MoRa weights to the nn.Linear weights, the number of multiplies drops back to that of a plain nn.Linear layer. The layer also becomes compatible with DoRa. In my testing, alternating between repeat and repeat_interleave (ReMoRa) improves on MoRa for continued training, and ReMoRa + DoRa improves on ReMoRa.
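To make the reformulation concrete, here is a minimal sketch of what I mean, not the exact code I tested. It assumes the MoRa compress/decompress pair is equivalent to tiling a square r x r matrix M across the full weight shape (true for the simple sharing variant, as far as I can tell), and it assumes the DoRa norm is taken per output row, as common implementations do. The names `decompress`, `MergedMoRaLinear`, `interleave`, and `use_dora` are hypothetical and chosen only for this example.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def decompress(M, out_features, in_features, interleave=False):
    """Expand a square (r, r) matrix to (out_features, in_features).

    interleave=False tiles M block-wise with Tensor.repeat.
    interleave=True stretches M element-wise with repeat_interleave.
    The result is cropped when the sizes are not multiples of r.
    """
    r = M.shape[0]
    reps_out = math.ceil(out_features / r)
    reps_in = math.ceil(in_features / r)
    if interleave:
        delta = M.repeat_interleave(reps_out, dim=0)
        delta = delta.repeat_interleave(reps_in, dim=1)
    else:
        delta = M.repeat(reps_out, reps_in)
    return delta[:out_features, :in_features]


class MergedMoRaLinear(nn.Module):
    """Hypothetical wrapper: adds the decompressed update to a frozen nn.Linear."""

    def __init__(self, base, r, interleave=False, use_dora=False):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # only M (and the magnitude) are trained
        self.interleave = interleave
        # Zero init, so the wrapped layer starts out identical to the base layer.
        self.M = nn.Parameter(torch.zeros(r, r))
        if use_dora:
            # Assumption: one magnitude per output row, initialised to the
            # row norms of the pretrained weight.
            row_norms = base.weight.detach().norm(p=2, dim=1, keepdim=True)
            self.magnitude = nn.Parameter(row_norms)
        else:
            self.register_parameter("magnitude", None)

    def forward(self, x):
        delta = decompress(
            self.M, self.base.out_features, self.base.in_features, self.interleave
        )
        weight = self.base.weight + delta
        if self.magnitude is not None:
            # DoRa-style step: normalise the direction, rescale by the magnitude.
            weight = self.magnitude * weight / weight.norm(p=2, dim=1, keepdim=True)
        # A single matmul against the merged weight, as in a plain nn.Linear.
        return F.linear(x, weight, self.base.bias)
```

Because the update is merged into the weight before the matmul, the DoRa normalisation can be applied to the combined weight directly. Switching between repeat and repeat_interleave is then just a matter of the `interleave` flag; when and how often to alternate is left to the training loop.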
