Learning the combination weights of pre-trained LoRA Modules #1655

Closed
mahdibeit opened this issue Apr 15, 2024 · 5 comments
mahdibeit commented Apr 15, 2024

Feature request

PEFT can combine pre-trained LoRA modules by averaging them or by supplying custom weights for a weighted average. The paper below shows that learning these combination weights outperforms naive averaging in few-shot adaptation settings.
WLoRA
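For context, here is a minimal sketch of how PEFT can already combine two pre-trained adapters with fixed weights today (the base model name and adapter paths are placeholders); the proposal is to learn those weights instead of fixing them:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load a base model and attach two pre-trained LoRA adapters (paths are placeholders).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(base, "path/to/lora_1", adapter_name="lora_1")
model.load_adapter("path/to/lora_2", adapter_name="lora_2")

# Weighted average with fixed, user-chosen weights; WLoRA would learn these instead.
model.add_weighted_adapter(
    adapters=["lora_1", "lora_2"],
    weights=[0.5, 0.5],
    adapter_name="combined",
    combination_type="linear",
)
model.set_adapter("combined")
```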

Motivation

Learning the combination weights lets users make use of pre-trained LoRA modules that are already available on the Hugging Face Hub. It is also very parameter-efficient, since only the combination weights are trained. More importantly, it can outperform training a LoRA from scratch in settings where the number of training samples is limited.

Your contribution

I can submit a PR. PEFT could then combine any set of pre-trained LoRA modules with an interface like the following:

```python
wlora_config = WLoraConfig(skilled_loras=[PATH_TO_UPSTREAM_1, PATH_TO_UPSTREAM_2])
model = get_peft_model(llama2, wlora_config)
```

BenjaminBossan (Member) commented

Hi, thank you for proposing to add this method.

I only skimmed the paper, but IIUC we assume that the user has a couple of already trained LoRA adapters and now wants to combine them for a new task. The idea is that learning the weights used for the weighted average (the weights argument of add_weighted_adapter) can lead to better results than naive uniform weights. (Note that we offer many combination types, not just averaging; maybe that's worth looking into for the paper.)

To learn these weights, I assume we have to load all the LoRA adapters at training time, freeze their weights, then add an extra scaling factor to this line, is that right?

I haven't thought through the overall design of this, but I think it should be possible to add this to the existing LoRA code without too many additions. Feel free to open a draft PR where we can discuss the design.

mahdibeit (Author) commented

Hi @BenjaminBossan, thanks for taking the time to read the paper.

Thank you for your great suggestion. We will evaluate other combination methods in the paper.

To answer your question: yes, you are absolutely right. We can just use the existing LoRA code with a boolean flag like learn_combination_weights to configure the training process. Overall, I need to do the following:

  1. Instantiate a trainable tensor named combination_weights that learns the scaling factor for each pre-trained, frozen LoRA.
  2. During the forward pass, apply a softmax over combination_weights and multiply each LoRA's output by the corresponding entry of combination_weights in this line.

Throughout these steps, I have to make sure that the LoRA weights stay frozen and that the merge method works as intended (see the sketch below).
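As a rough standalone sketch of these two steps (hypothetical code, not the actual PEFT layer internals; the class name and constructor arguments are made up for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedLoraLinear(nn.Module):
    """Toy layer: base linear plus a learned mixture of frozen LoRA adapters."""

    def __init__(self, base_layer, lora_As, lora_Bs, scalings):
        super().__init__()
        self.base_layer = base_layer  # assumed frozen elsewhere, as usual for LoRA
        # Pre-trained LoRA factors, kept frozen.
        self.lora_As = nn.ModuleList(lora_As)
        self.lora_Bs = nn.ModuleList(lora_Bs)
        for m in list(self.lora_As) + list(self.lora_Bs):
            m.requires_grad_(False)
        self.scalings = scalings
        # Step 1: a single trainable tensor, one entry per upstream LoRA.
        self.combination_weights = nn.Parameter(torch.zeros(len(lora_As)))

    def forward(self, x):
        result = self.base_layer(x)
        # Step 2: softmax over the combination weights, then scale each
        # frozen LoRA's contribution by its learned weight.
        weights = F.softmax(self.combination_weights, dim=0)
        for w, A, B, s in zip(weights, self.lora_As, self.lora_Bs, self.scalings):
            result = result + w * s * B(A(x))
        return result
```

In this sketch only combination_weights requires gradients, so training touches a single tensor whose size equals the number of upstream LoRAs.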


Alternatively, it is possible to create a new peft/tuner module named something like WLoRA and write the appropriate class there. This would allow users to simply run the following:

```python
wlora_config = WLoraConfig(upstream_loras=[PATH_TO_UPSTREAM_1, PATH_TO_UPSTREAM_2])
model = get_peft_model(llama2, wlora_config)
```

I personally prefer the second option, as it does not complicate the main LoRA module and is easier to use. However, I trust your judgment. Let me know which direction you prefer, and I can start implementing and open a draft PR.

BenjaminBossan (Member) commented

I think the second suggestion with a dedicated class is good; it can still reuse much of the existing code, though. Regardless, if you have some code to share, feel free to do so, as that makes the discussion much easier.

mahdibeit (Author) commented

Hi @BenjaminBossan, I just opened a draft PR using the first method. Let me know what you think. My main concern is this line, which freezes the pre-trained lora_A and lora_B.
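For illustration, one possible way to enforce that freezing (a sketch assuming PEFT's usual lora_A/lora_B parameter naming, not the exact line referenced in the PR):

```python
# Disable gradients for every pre-trained lora_A / lora_B parameter so that
# only the combination weights are trained.
for name, param in model.named_parameters():
    if "lora_A" in name or "lora_B" in name:
        param.requires_grad_(False)
```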


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
