Add lora+ implementation #1509
base: main
Conversation
Duplicate of #1504 :) Sorry about closing (wrong button).
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What was the conclusion in that issue?
No conclusion yet, we want to wait and see if the performance gains are indeed robust. Regarding your code, it's basically just a giant string with the code, right? Was that the intent?
Waiting for you to confirm whether I should implement a new Trainer object or not.
Hey, after some discussion, I think we can proceed with this project. Some considerations:
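To illustrate the earlier question about a new Trainer object: a rough, hypothetical sketch of how a LoRA+ optimizer built by a helper like this one could be plugged into the existing transformers Trainer, with no new Trainer subclass required. peft_model, train_dataset, and the hyperparameter values are placeholders, and the helper's module path and kwargs follow the diff later in this thread.

import torch
from transformers import Trainer, TrainingArguments
from peft.optimizers.loraplus import create_loraplus_optimizer  # module path proposed later in this thread

# `peft_model` is assumed to be a PeftModel with LoRA adapters attached,
# `train_dataset` a ready-to-use dataset.
optimizer = create_loraplus_optimizer(
    model=peft_model,
    optimizer_cls=torch.optim.AdamW,
    optimizer_kwargs={"lr": 5e-5, "weight_decay": 0.0, "loraplus_lr_ratio": 16},  # ratio value is illustrative
)
trainer = Trainer(
    model=peft_model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,
    optimizers=(optimizer, None),  # Trainer builds the lr scheduler itself when None is given
)
trainer.train()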
@moghadas76 do you still plan on working on this?
Yes, this weekend I'll fix the points.
Great, thanks. On top of what I mentioned, let's also move this to a new file. I'm thinking src/peft/optimizers/loraplus.py.
Please review my code.
Thanks for working on this. It is a good start, but there are a few issues; please check my comments. On top of that, could you please move the function out of helpers.py into a separate module, as I mentioned above?
Great, thanks. On top of what I mentioned, let's also move this to a new file. I'm thinking src/peft/optimizers/loraplus.py. The idea here is that we want to add more optimizer-related methods in the future, so it makes sense to choose a proper file structure right away.
Moreover, it would be great to document this function in our PEFT docs, but it would be fine to do that in a follow-up PR.
Finally, please run make style on your changes.
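For reference, a minimal sketch of what importing and calling the helper from the proposed module path could look like; it mirrors the test setup in this PR, and `model` is a placeholder for a PeftModel with LoRA adapters attached:

import bitsandbytes as bnb
from peft.optimizers.loraplus import create_loraplus_optimizer

# Values taken from the tests in this PR; `model` is assumed to be a PeftModel.
optimizer = create_loraplus_optimizer(
    model=model,
    optimizer_cls=bnb.optim.Adam8bit,
    optimizer_kwargs={
        "lr": 5e-5,
        "eps": 1e-6,
        "betas": (0.9, 0.999),
        "weight_decay": 0.0,
        "loraplus_lr_ratio": 0.2,  # multiplier applied to the lr of the LoRA B parameters
    },
    loraplus_lr_embedding=1e-6,  # separate lr for LoRA embedding layers
)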
@moghadas76 Do you still plan on working on this?
Yes, I'll fix the comments tonight
Thanks. No need to rush, I just wanted to inquire if you're still on it :)
@moghadas76 LMK once you're finished with your changes and want me to do another review.
Gentle ping @moghadas76
Hi
Sorry for the delay, I was at a conference, will review soon.
Thanks for making the adjustments, this already looks quite good. I still found a few minor areas for improvements, which I commented. Also, as mentioned in my earlier comment, could you please move the code to a different file?
}
optim = create_loraplus_optimizer(model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config)
assert optim is not None
assert len(optim.param_groups) == 4
Can we do some more checks here?
Done, another test added.
would it make sense to also check the specific learning rates we get for the different param groups?
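Something along these lines could work as an extra check, assuming the grouping in this PR yields one group at the base lr, two lora_B groups at lr * loraplus_lr_ratio, and one embedding group at loraplus_lr_embedding; the comparison below is order-independent, so it doesn't rely on how the groups are arranged:

import pytest

# Hypothetical follow-up assertions for the existing test, using the values from optim_config.
lr, ratio, lr_embedding = 5e-5, 0.2, 1e-6
actual_lrs = sorted(group["lr"] for group in optim.param_groups)
expected_lrs = sorted([lr, lr * ratio, lr * ratio, lr_embedding])
assert actual_lrs == pytest.approx(expected_lrs)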
@BenjaminBossan
At the risk of repeating myself for the 5th time, could you please make the following change:
Sorry... I missed that comment. I've fixed it.
Hmm, code quality checks are still failing with:
Is it possible that your local ruff version differs? CI uses v0.2.2.
You were right. My ruff version was old.
Thanks for the updates. Our code style check still fails though, not sure what the reason is if you use the same ruff version. Here is the diff that I get when running ruff locally on your branch:
modified src/peft/optimizers/__init__.py
@@ -17,4 +17,4 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-from .loraplus import create_loraplus_optimizer
\ No newline at end of file
+from .loraplus import create_loraplus_optimizer
modified src/peft/optimizers/loraplus.py
@@ -8,20 +8,24 @@ from transformers.trainer_pt_utils import get_parameter_names
from ..peft_model import PeftModel
-def create_loraplus_optimizer(model: PeftModel, optimizer_cls: type[Optimizer], optimizer_kwargs: dict, loraplus_lr_embedding: float=1e-6) -> Optimizer:
+def create_loraplus_optimizer(
+ model: PeftModel, optimizer_cls: type[Optimizer], optimizer_kwargs: dict, loraplus_lr_embedding: float = 1e-6
+) -> Optimizer:
"""
- Creates a LoraPlus optimizer.
- Implementing LoRA+ https://arxiv.org/abs/2402.12354
- Reference: https://github.com/nikhil-ghosh-berkeley/loraplus/
+ Creates a LoraPlus optimizer. Implementing LoRA+ https://arxiv.org/abs/2402.12354 Reference:
+ https://github.com/nikhil-ghosh-berkeley/loraplus/
Args:
model (`torch.nn.Module`): The model to be optimized.
optimizer_cls (`torch.optim.Optimizer`): The optimizer class to be used.
optimizer_kwargs (`dict`): Additional keyword arguments to be passed to the optimizer.
- - **loraplus_lr_ratio** (`float`): The ratio of the learning rate to be used for the embedding layer. Defaults to loraplus_lr_ratio
- - loraplus_lr_embedding (`float`): The learning rate to be used for the embedding layer. Defaults to loraplus_lr_embedding
+ - **loraplus_lr_ratio** (`float`): The ratio of the learning rate to be used for the embedding layer.
+ Defaults to loraplus_lr_ratio
+ - loraplus_lr_embedding (`float`): The learning rate to be used for the embedding layer. Defaults to
+ loraplus_lr_embedding
"""
from ..tuners.lora.layer import Embedding
+
loraplus_lr_ratio = optimizer_kwargs.pop("loraplus_lr_ratio")
decay_parameters = get_parameter_names(model, ALL_LAYERNORM_LAYERS)
@@ -81,6 +85,7 @@ def create_loraplus_optimizer(model: PeftModel, optimizer_cls: type[Optimizer],
optimizer = optimizer_cls(optimizer_grouped_parameters, **optimizer_kwargs)
if optimizer_cls.__name__ == "Adam8bit":
import bitsandbytes
+
manager = bitsandbytes.optim.GlobalOptimManager.get_instance()
for module in model.modules():
if isinstance(module, nn.Embedding):
modified tests/test_loraplus_helper.py
@@ -25,32 +25,37 @@ def test_lora_plus_helper_sucess():
model = SimpleNet()
optimizer_cls = bnb.optim.Adam8bit
optim_config = {
- 'lr': 5e-5,
- 'eps': 1e-6,
- 'betas': (0.9, 0.999),
- 'weight_decay': 0.0,
+ "lr": 5e-5,
+ "eps": 1e-6,
+ "betas": (0.9, 0.999),
+ "weight_decay": 0.0,
"loraplus_lr_ratio": 0.2,
}
- optim = create_loraplus_optimizer(model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6)
+ optim = create_loraplus_optimizer(
+ model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6
+ )
assert optim is not None
assert len(optim.param_groups) == 4
+
def test_lora_plus_optimizer_sucess():
optimizer_cls = bnb.optim.Adam8bit
optim_config = {
- 'lr': 5e-5,
- 'eps': 1e-6,
- 'betas': (0.9, 0.999),
- 'weight_decay': 0.0,
+ "lr": 5e-5,
+ "eps": 1e-6,
+ "betas": (0.9, 0.999),
+ "weight_decay": 0.0,
"loraplus_lr_ratio": 0.2,
}
model: SimpleNet = SimpleNet().cuda()
- optim = create_loraplus_optimizer(model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6)
+ optim = create_loraplus_optimizer(
+ model=model, optimizer_cls=optimizer_cls, optimizer_kwargs=optim_config, loraplus_lr_embedding=1e-6
+ )
loss = torch.nn.CrossEntropyLoss()
bnb.optim.GlobalOptimManager.get_instance().register_parameters(model.parameters())
x = torch.randint(100, (2, 4, 10)).cuda()
output = model(x).permute(0, 3, 1, 2)
- label = torch.randint(16, (2,4,10,)).cuda()
+ label = torch.randint(16, (2, 4, 10)).cuda()
loss_value = loss(output, label)
loss_value.backward()
optim.step()
assert len(optim.param_groups) == 4
def test_lora_plus_optimizer_sucess():
optimizer_cls = bnb.optim.Adam8bit |
Could you please add a short comment here about what is being tested?
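For example, something along these lines; the wording is only a suggestion based on what the test appears to exercise:

import bitsandbytes as bnb

def test_lora_plus_optimizer_sucess():
    # Check that the optimizer built by create_loraplus_optimizer can run a full
    # forward/backward/step cycle with bnb.optim.Adam8bit and keeps its 4 param groups.
    optimizer_cls = bnb.optim.Adam8bit
    ...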