
Add LoRA+ implementation #1509

Open · wants to merge 15 commits into base: main
Conversation

@moghadas76 (Author)

@BenjaminBossan (Member) commented Feb 26, 2024:

Duplicate of #1504 :)

Sorry about closing (wrong button).

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@moghadas76 (Author):

What was the conclusion in that issue?

@BenjaminBossan (Member):

No conclusion yet, we want to wait and see if the performance gains are indeed robust. Regarding your code, it's basically just a giant string with the code, right? Was that the intent?

@moghadas76 (Author):

I'm waiting for you to say whether I should implement a new Trainer object or not.

@BenjaminBossan (Member):

Hey, after some discussion, I think we can proceed with this project. Let's add the create_loraplus_optimizer function but not the custom trainer class. We can put the function inside of peft/helpers.py.

Some considerations:

  • Add a reference to the original repo
  • Update the docs
  • Remove the logger code
  • If you feel up for the task, let's add some unit tests.
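For reference, a minimal usage sketch of the proposed helper, assuming the signature that appears later in this PR's diff (create_loraplus_optimizer(model, optimizer_cls, optimizer_kwargs, loraplus_lr_embedding=...)) and the peft.optimizers import path it eventually lands in; model name and hyperparameters are placeholders:

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_loraplus_optimizer  # final location per this PR

# Wrap a base model with LoRA adapters (placeholder model and config values).
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
peft_model = get_peft_model(base_model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# LoRA+ gives the lora_B (and embedding) parameters a different learning rate
# than lora_A; "loraplus_lr_ratio" is popped from optimizer_kwargs by the helper.
optimizer = create_loraplus_optimizer(
    model=peft_model,
    optimizer_cls=torch.optim.AdamW,
    optimizer_kwargs={"lr": 5e-5, "weight_decay": 0.0, "loraplus_lr_ratio": 16},
    loraplus_lr_embedding=1e-6,
)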

@BenjaminBossan (Member):

@moghadas76 do you still plan on working on this?

@moghadas76 (Author) commented Mar 12, 2024 via email

@BenjaminBossan (Member):

Great, thanks. On top of what I mentioned, let's also move this to a new file. I'm thinking src/peft/optimizers/loraplus.py. The idea here is that we want to add more optimizer-related methods in the future, so it makes sense to choose a proper file structure right away.
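The resulting layout would roughly be as follows; the re-export line mirrors the src/peft/optimizers/__init__.py change visible in the diff further down this thread:

# src/peft/optimizers/__init__.py re-exports the helper:
#     from .loraplus import create_loraplus_optimizer
#
# src/peft/optimizers/loraplus.py holds the implementation, so callers can write:
from peft.optimizers import create_loraplus_optimizer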

@moghadas76 (Author):

Please review my code

@BenjaminBossan (Member) left a comment:

Thanks for working on this. It is a good start but there are a few issues, please check my comments. On top of that, could you please move the function out of helpers.py into a separate module, as I mentioned above?

> Great, thanks. On top of what I mentioned, let's also move this to a new file. I'm thinking src/peft/optimizers/loraplus.py. The idea here is that we want to add more optimizer-related methods in the future, so it makes sense to choose a proper file structure right away.

Moreover, it would be great to document this function in our PEFT docs, but it would be fine to do that in a follow-up PR.

Finally, please run make style on your changes.

Review threads: src/peft/tuners/lora/config.py (1), src/peft/helpers.py (6), tests/test_loraplus_helper.py (1)
@BenjaminBossan (Member):

@moghadas76 Do you still plan on working on this?

@moghadas76 (Author) commented Mar 25, 2024 via email

@BenjaminBossan (Member):

> Yes, I'll fix the comments tonight

Thanks. No need to rush, I just wanted to inquire if you're still on it :)

@BenjaminBossan (Member):

@moghadas76 LMK once you're finished with your changes and want me to do another review.

@BenjaminBossan (Member):

Gentle ping @moghadas76

@moghadas76 (Author):

Hi,
I fixed the comments. Please review again.

@moghadas76 (Author):

@BenjaminBossan

@BenjaminBossan (Member):

Sorry for the delay, I was at a conference, will review soon.

@BenjaminBossan (Member) left a comment:

Thanks for making the adjustments, this already looks quite good. I still found a few minor areas for improvement, which I commented on. Also, as mentioned in my earlier comment, could you please move the code to a different file?

Review threads: src/peft/utils/peft_types.py (1), src/peft/helpers.py (2)
    }
    optim = create_loraplus_optimizer(model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config)
    assert optim is not None
    assert len(optim.param_groups) == 4
@BenjaminBossan (Member):

Can we do some more checks here?

@moghadas76 (Author):

Done... another test added.

@BenjaminBossan (Member):

Would it make sense to also check the specific learning rates we get for the different param groups?
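For illustration, such a check could look roughly like this. It is only a sketch meant to sit next to the existing tests in tests/test_loraplus_helper.py (reusing its SimpleNet model), and the expected values assume the groups receive lr, lr * loraplus_lr_ratio, and loraplus_lr_embedding respectively; that mapping is inferred from the docstring and diff below, not spelled out in the PR:

import bitsandbytes as bnb

from peft.optimizers import create_loraplus_optimizer


def test_lora_plus_param_group_learning_rates():
    model = SimpleNet()  # same toy model as the existing tests in this file
    optim_config = {
        "lr": 5e-5,
        "eps": 1e-6,
        "betas": (0.9, 0.999),
        "weight_decay": 0.0,
        "loraplus_lr_ratio": 0.2,
    }
    optim = create_loraplus_optimizer(
        model=model,
        optimizer_cls=bnb.optim.Adam8bit,
        optimizer_kwargs=optim_config,
        loraplus_lr_embedding=1e-6,
    )
    lrs = [group["lr"] for group in optim.param_groups]
    # Assumed mapping: base lr for the lora_A / "other" groups, lr scaled by
    # loraplus_lr_ratio for the lora_B groups, loraplus_lr_embedding for embeddings.
    assert 5e-5 in lrs
    assert 5e-5 * 0.2 in lrs
    assert 1e-6 in lrs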

@moghadas76 (Author):

@BenjaminBossan
May I ask for a final check? Best regards.

@BenjaminBossan (Member):

At the risk of repeating myself for the 5th time, could you please make the following change:

> Great, thanks. On top of what I mentioned, let's also move this to a new file. I'm thinking src/peft/optimizers/loraplus.py. The idea here is that we want to add more optimizer-related methods in the future, so it makes sense to choose a proper file structure right away.

@moghadas76 (Author):

Sorry... I missed that comment. I've fixed it

@BenjaminBossan (Member):

Hmm, code quality checks are still failing with:

tests/test_loraplus_helper.py:1:1: I001 [*] Import block is un-sorted or un-formatted

Is it possible that your local ruff version differs? CI uses v0.2.2.

@moghadas76 (Author):

You were right. My ruff version was old.

@BenjaminBossan (Member) left a comment:

Thanks for the updates. Our code style check still fails though, not sure what the reason is if you use the same ruff version. Here is the diff that I get when running ruff locally on your branch:

modified   src/peft/optimizers/__init__.py
@@ -17,4 +17,4 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from .loraplus import create_loraplus_optimizer
\ No newline at end of file
+from .loraplus import create_loraplus_optimizer
modified   src/peft/optimizers/loraplus.py
@@ -8,20 +8,24 @@ from transformers.trainer_pt_utils import get_parameter_names
 from ..peft_model import PeftModel
 
 
-def create_loraplus_optimizer(model: PeftModel, optimizer_cls: type[Optimizer], optimizer_kwargs: dict, loraplus_lr_embedding: float=1e-6) -> Optimizer:
+def create_loraplus_optimizer(
+    model: PeftModel, optimizer_cls: type[Optimizer], optimizer_kwargs: dict, loraplus_lr_embedding: float = 1e-6
+) -> Optimizer:
     """
-    Creates a LoraPlus optimizer.
-    Implementing LoRA+ https://arxiv.org/abs/2402.12354
-    Reference: https://github.com/nikhil-ghosh-berkeley/loraplus/
+    Creates a LoraPlus optimizer. Implementing LoRA+ https://arxiv.org/abs/2402.12354 Reference:
+    https://github.com/nikhil-ghosh-berkeley/loraplus/
 
     Args:
         model (`torch.nn.Module`): The model to be optimized.
         optimizer_cls (`torch.optim.Optimizer`): The optimizer class to be used.
         optimizer_kwargs (`dict`): Additional keyword arguments to be passed to the optimizer.
-            - **loraplus_lr_ratio** (`float`): The ratio of the learning rate to be used for the embedding layer. Defaults to loraplus_lr_ratio
-            - loraplus_lr_embedding (`float`): The learning rate to be used for the embedding layer. Defaults to loraplus_lr_embedding
+            - **loraplus_lr_ratio** (`float`): The ratio of the learning rate to be used for the embedding layer.
+              Defaults to loraplus_lr_ratio
+            - loraplus_lr_embedding (`float`): The learning rate to be used for the embedding layer. Defaults to
+              loraplus_lr_embedding
     """
     from ..tuners.lora.layer import Embedding
+
     loraplus_lr_ratio = optimizer_kwargs.pop("loraplus_lr_ratio")
 
     decay_parameters = get_parameter_names(model, ALL_LAYERNORM_LAYERS)
@@ -81,6 +85,7 @@ def create_loraplus_optimizer(model: PeftModel, optimizer_cls: type[Optimizer],
     optimizer = optimizer_cls(optimizer_grouped_parameters, **optimizer_kwargs)
     if optimizer_cls.__name__ == "Adam8bit":
         import bitsandbytes
+
         manager = bitsandbytes.optim.GlobalOptimManager.get_instance()
         for module in model.modules():
             if isinstance(module, nn.Embedding):
modified   tests/test_loraplus_helper.py
@@ -25,32 +25,37 @@ def test_lora_plus_helper_sucess():
     model = SimpleNet()
     optimizer_cls = bnb.optim.Adam8bit
     optim_config = {
-        'lr': 5e-5,
-        'eps': 1e-6,
-        'betas': (0.9, 0.999),
-        'weight_decay': 0.0,
+        "lr": 5e-5,
+        "eps": 1e-6,
+        "betas": (0.9, 0.999),
+        "weight_decay": 0.0,
         "loraplus_lr_ratio": 0.2,
     }
-    optim = create_loraplus_optimizer(model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6)
+    optim = create_loraplus_optimizer(
+        model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6
+    )
     assert optim is not None
     assert len(optim.param_groups) == 4
 
+
 def test_lora_plus_optimizer_sucess():
     optimizer_cls = bnb.optim.Adam8bit
     optim_config = {
-        'lr': 5e-5,
-        'eps': 1e-6,
-        'betas': (0.9, 0.999),
-        'weight_decay': 0.0,
+        "lr": 5e-5,
+        "eps": 1e-6,
+        "betas": (0.9, 0.999),
+        "weight_decay": 0.0,
         "loraplus_lr_ratio": 0.2,
     }
     model: SimpleNet = SimpleNet().cuda()
-    optim = create_loraplus_optimizer(model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6)
+    optim = create_loraplus_optimizer(
+        model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6
+    )
     loss = torch.nn.CrossEntropyLoss()
     bnb.optim.GlobalOptimManager.get_instance().register_parameters(model.parameters())
     x = torch.randint(100, (2, 4, 10)).cuda()
     output = model(x).permute(0, 3, 1, 2)
-    label = torch.randint(16, (2,4,10,)).cuda()
+    label = torch.randint(16, (2, 4, 10)).cuda()
     loss_value = loss(output, label)
     loss_value.backward()
     optim.step()

    assert len(optim.param_groups) == 4

def test_lora_plus_optimizer_sucess():
    optimizer_cls = bnb.optim.Adam8bit
@BenjaminBossan (Member):

Could you please add a short comment here of what is being tested?
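For example, a brief docstring along these lines would cover it (suggested wording only, not taken from the PR; the function name matches the test shown above):

def test_lora_plus_optimizer_sucess():
    """
    Check that the optimizer returned by create_loraplus_optimizer can run a
    full training step (forward, loss, backward, optimizer.step()) on GPU with
    the bitsandbytes Adam8bit optimizer.
    """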

