
change english text_encoder to other language? #117

Open
jammyWolf opened this issue Dec 29, 2022 · 2 comments


jammyWolf commented Dec 29, 2022

Hello author, thanks for the great work! I want to use ALBEF to train an image-text model for another language, and I am a little confused about the fine-tuning procedure.

Here are the steps I took:

  1. Load the .pth checkpoint from your repo and iterate over its parameters.
  2. For tensors whose names contain text_encoder, use the parameters of the pretrained BERT model bert-base-chinese instead of the ones in the checkpoint.
  3. Freeze the parameters in the ALBEF model whose tensor names contain visual_encoder.

The code looks like this:

```python
import torch
from models.model_retrieval import ALBEF            # import paths assumed to match the repo's Retrieval.py
from models.tokenization_bert import BertTokenizer

tokenizer = BertTokenizer.from_pretrained(args.text_encoder)  # load the Chinese BERT pretrained model
model = ALBEF(config=config, text_encoder=args.text_encoder, tokenizer=tokenizer)
model_dict = model.state_dict()

# Load parameters from the ckpt file, leaving out tensors whose names contain text_encoder.
# Only keys that exist in the new model with the same shape are kept.
pretrained_dict = torch.load(args.checkpoint, map_location='cpu')['model']
kept = {
    k: v for k, v in pretrained_dict.items()
    if "text_encoder" not in k and k in model_dict and model_dict[k].shape == v.shape
}

# text_encoder tensors keep the values the model was built with; the rest come from the ckpt.
model_dict.update(kept)
model.load_state_dict(model_dict)

# Freeze visual_encoder. requires_grad has to be set on the model's parameters;
# setting it on tensors in a state dict has no effect on training.
for name, param in model.named_parameters():
    if "visual_encoder" in name:
        param.requires_grad = False
```

In the end I got a bad recall score on the Flickr-CN dataset. Could you give me some advice?
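
One way to confirm that the freeze in step 3 actually took effect is to count trainable parameters per top-level module. The sketch below is not from the ALBEF repo: `trainable_summary` and `ToyModel` are hypothetical names, and the toy model only stands in for ALBEF so the check can run on its own.

```python
from collections import defaultdict

import torch.nn as nn


def trainable_summary(model: nn.Module) -> dict:
    """Return {top_level_module: (trainable_count, total_count)}."""
    counts = defaultdict(lambda: [0, 0])
    for name, param in model.named_parameters():
        top = name.split(".")[0]
        counts[top][1] += param.numel()
        if param.requires_grad:
            counts[top][0] += param.numel()
    return {k: tuple(v) for k, v in counts.items()}


# Toy stand-in for ALBEF with hypothetical sizes, only to demonstrate the check.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.visual_encoder = nn.Linear(4, 4)
        self.text_encoder = nn.Linear(4, 2)


toy = ToyModel()
for name, param in toy.named_parameters():
    if name.startswith("visual_encoder"):
        param.requires_grad = False  # freeze on the parameter itself

summary = trainable_summary(toy)
print(summary)
```

If `visual_encoder` still shows a non-zero trainable count on the real model, the freeze was applied to the wrong tensors.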

@LiJunnan1992
Contributor

Hi, it won't work if you directly replace bert-en with bert-cn, because the parameters of the two models are different. ALBEF is pre-trained with bert-en, so its weights cannot be directly applied to bert-cn.
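
To see how much of the model ends up outside the ALBEF checkpoint after such a swap, it can help to sort the checkpoint keys into what was loaded and what was not. This is a minimal sketch under stated assumptions: `partition_checkpoint` is a hypothetical helper, and the key names and shapes in the toy dictionaries are made up for illustration, not taken from the real checkpoint.

```python
import torch


def partition_checkpoint(pretrained: dict, model_dict: dict, skip_substring: str = "text_encoder"):
    """Split checkpoint keys into loaded / skipped-by-name / skipped-by-mismatch."""
    loaded, skipped_by_name, skipped_by_mismatch = [], [], []
    for k, v in pretrained.items():
        if skip_substring in k:
            skipped_by_name.append(k)
        elif k in model_dict and model_dict[k].shape == v.shape:
            loaded.append(k)
        else:
            # key missing from the new model, or present with a different shape
            skipped_by_mismatch.append(k)
    not_in_checkpoint = [k for k in model_dict if k not in pretrained]
    return loaded, skipped_by_name, skipped_by_mismatch, not_in_checkpoint


# Toy dictionaries with made-up keys and shapes.
pretrained = {
    "visual_encoder.w": torch.zeros(2, 2),
    "text_encoder.embeddings.weight": torch.zeros(5, 3),
    "proj.weight": torch.zeros(2, 3),
}
model_dict = {
    "visual_encoder.w": torch.zeros(2, 2),
    "text_encoder.embeddings.weight": torch.zeros(7, 3),
    "proj.weight": torch.zeros(2, 4),
    "extra.bias": torch.zeros(2),
}

loaded, by_name, by_mismatch, missing = partition_checkpoint(pretrained, model_dict)
print(loaded, by_name, by_mismatch, missing)
```

Everything listed under the skipped and missing groups was never trained together with the tensors that were loaded, which is one way to read the reply above.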


ozanciga commented May 4, 2023

Hey @LiJunnan1992, what is your opinion on using adapters to update the pretrained model, or something like low-rank adaptation (LoRA)? I'm wondering whether such partial training can be applied to aligning these models. It seems possible, but I would like an expert opinion, as the setup may be costly in time and compute. Thank you!
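
For readers unfamiliar with the idea, a low-rank adapter around a single linear layer can be sketched in plain PyTorch as below. This only illustrates the general technique; `LoRALinear`, the rank, and the scaling value are assumptions chosen for the example, and nothing in this thread confirms that it works for aligning ALBEF with a different text encoder.

```python
import math

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen nn.Linear plus a trainable low-rank update (hypothetical sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay fixed
        self.lora_a = nn.Parameter(torch.empty(rank, base.in_features))
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base output plus the scaled low-rank correction x @ A^T @ B^T
        update = (x @ self.lora_a.t()) @ self.lora_b.t()
        return self.base(x) + update * self.scaling


layer = LoRALinear(nn.Linear(8, 8), rank=2)
x = torch.randn(3, 8)
out = layer(x)

trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(out.shape, trainable)
```

Because `lora_b` starts at zero, the wrapped layer initially reproduces the frozen layer's output, and only the two small matrices receive gradients.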
