Description
Is your feature request related to a problem? Please describe.
The current implementation of the enable_lora method in Keras only allows LoRA to be enabled for a predefined set of layers ("query_dense", "value_dense", "query", "value"). This is limiting because it doesn't provide flexibility for users who want to apply LoRA to other layers.
- In many custom architectures, users may want to enable LoRA for different layers that do not follow the predefined naming convention, which leads to unnecessary modifications to the codebase.
- Additionally, enabling LoRA on multiple layers can significantly increase the number of trainable parameters in a model, which could lead to memory issues, especially on devices with limited resources (such as GPUs with lower VRAM). Users may want finer control over which layers have LoRA enabled to avoid memory bottlenecks during training.
Describe the solution you'd like
I propose modifying the enable_lora method to accept a list of custom layer names as a parameter. This way, users can dynamically specify which layers they want to apply LoRA to. The solution could involve adding an optional target_names argument to the method that defaults to the current predefined layers but can be overridden by users. For example:
def enable_lora(self, rank, target_names=None):
    target_names = target_names if target_names is not None else ["query_dense", "value_dense", "query", "value"]
This enhancement would allow more flexibility in applying LoRA to various architectures without needing to modify the core method.
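To make the intended behavior concrete, here is a minimal, self-contained sketch of the layer-selection logic the proposal implies (the helper name `select_lora_layers` and the constant `DEFAULT_LORA_TARGETS` are my own illustrative names, not existing Keras API):

```python
# Hypothetical sketch of the proposed selection logic, not actual Keras source.
# enable_lora would walk the model's layers and enable LoRA only for layers
# whose name appears in target_names, defaulting to the current hard-coded list.

DEFAULT_LORA_TARGETS = ["query_dense", "value_dense", "query", "value"]

def select_lora_layers(layer_names, target_names=None):
    """Return the layer names that should receive LoRA adapters."""
    targets = target_names if target_names is not None else DEFAULT_LORA_TARGETS
    return [name for name in layer_names if name in targets]

layers = ["query_dense", "key_dense", "value_dense", "ffn_dense"]

# Default behavior: only the predefined attention projections match.
print(select_lora_layers(layers))                 # ['query_dense', 'value_dense']

# Custom targets: the user opts in to a non-standard layer instead.
print(select_lora_layers(layers, ["ffn_dense"]))  # ['ffn_dense']
```

Because the default preserves the current target list, existing callers of enable_lora would be unaffected; only users who pass target_names explicitly would see different behavior.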
Describe alternatives you've considered
Some alternatives I have considered:
1. Inheritance and override: Subclass the relevant Keras model classes and override the enable_lora method with a custom implementation. While this avoids directly modifying the library, it duplicates the parent method's logic and introduces complexity and redundancy for what is a common use case.
2. Defining a custom LoRA layer and adding it manually to the model: Another alternative is to manually define a custom LoRA layer and adapt it to different parts of the model. I have personally experimented with this approach, and while it works, it is tedious: you need to implement the LoRA logic yourself and inject it into the model by hand.
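To illustrate why alternative 2 is tedious, here is a minimal pure-NumPy sketch of the LoRA math one has to reimplement by hand (illustrative only, not the Keras implementation): a frozen dense weight W is augmented with trainable low-rank factors A and B, so the effective weight is W + B @ A.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 4

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection, zero-init

def lora_dense(x):
    # Frozen path plus low-rank update; with B == 0 the layer initially
    # behaves exactly like the original dense layer.
    return x @ W.T + x @ A.T @ B.T

x = rng.normal(size=(2, d_in))
assert np.allclose(lora_dense(x), x @ W.T)  # identical at initialization

# Trainable-parameter comparison: full fine-tune vs LoRA adapters.
print(W.size)           # 262144
print(A.size + B.size)  # 4096
```

The parameter counts also show why per-layer control matters for the memory concern above: each adapted layer adds rank * (d_in + d_out) trainable parameters, so letting users choose exactly which layers to adapt directly bounds the memory cost.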
Additional context
This feature would provide better control for users who are training models on resource-constrained environments and make it easier to integrate LoRA into custom models, especially when experimenting.
Note: I would like to take up this issue and implement the proposed changes. If this feature request is accepted, I will raise a pull request (PR) with the necessary modifications.