AttributeError: 'FakeTokenizer' object has no attribute 'encode' #335
Comments
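Hi, has this problem been solved? I'm running into it too.
+1, I'm hitting this as well. Has it been solved?
I found the cause, but haven't fully resolved it yet.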
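Please check that every model path and tokenizer path is a local path. The default is THUDM/Visualglm-6b; to run locally, you need to change the paths to the corresponding local locations. If you can reach Hugging Face directly, this problem should not occur.
I changed the visual-glm path to a local one, but I couldn't find where to change the chatglm path.
You need to download all the required models and tokenizers to local disk in advance, and then change the paths to point to them.
Here is a Hugging Face mirror site; please read it carefully.
I downloaded it through cli-demo. Is it correct if it looks like this?
Yes.
Does the tokenizer also need to go in the visualglm-6b folder?
Yes, all of it is needed. Everything has to be prepared in advance, and the paths in the relevant config files and code have to be updated.
Got it. Changing tokenizer_type in model_config.json to the local chatglm path did the trick~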
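For anyone landing here later, the fix described above can be sketched as a small script. This is only an illustration, not code from the repository: the helper name `set_local_tokenizer` and both paths in the usage comment are hypothetical, and it assumes model_config.json is plain JSON with a top-level tokenizer_type key, as the comment above implies.

```python
import json
from pathlib import Path


def set_local_tokenizer(config_path, tokenizer_dir):
    """Point tokenizer_type in a model_config.json at a local directory.

    Returns the previous value so it can be restored if needed.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    previous = config.get("tokenizer_type")
    # Only this one key is changed; every other setting is written back as-is.
    config["tokenizer_type"] = str(tokenizer_dir)
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return previous


# Example call (both paths are placeholders, adjust to your own layout):
# set_local_tokenizer("/path/to/visualglm-6b/model_config.json",
#                     "/path/to/chatglm-6b")
```

Editing the file by hand works just as well; the script only makes the change repeatable across machines.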
Traceback (most recent call last):
File "finetune_visualglm.py", line 194, in <module>
training_main(args, model_cls=model, forward_step_function=forward_step, create_dataset_function=create_dataset_function, collate_fn=data_collator)
File "/home/wangzy/anaconda3/envs/LVM/lib/python3.8/site-packages/sat/training/deepspeed_training.py", line 67, in training_main
train_data, val_data, test_data = make_loaders(args, hooks['create_dataset_function'], collate_fn=collate_fn)
File "/home/wangzy/anaconda3/envs/LVM/lib/python3.8/site-packages/sat/data_utils/configure_data.py", line 200, in make_loaders
train = make_dataset(**data_set_args, args=args, dataset_weights=args.train_data_weights, is_train_data=True)
File "/home/wangzy/anaconda3/envs/LVM/lib/python3.8/site-packages/sat/data_utils/configure_data.py", line 126, in make_dataset_full
d = create_dataset_function(p, args)
File "finetune_visualglm.py", line 160, in create_dataset_function
dataset = FewShotDataset(path, image_processor, tokenizer, args)
File "finetune_visualglm.py", line 118, in __init__
input0 = tokenizer.encode("", add_special_tokens=False)
AttributeError: 'FakeTokenizer' object has no attribute 'encode'
How can I solve it?
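The traceback shows the failure only once the dataset constructor calls `tokenizer.encode`, far from wherever the tokenizer was created. A minimal sketch of an early check that could be run right after the tokenizer is obtained is shown here; `require_encode` is a hypothetical helper, not part of sat or the VisualGLM scripts, and the hint in its message reflects the path advice given in this thread rather than a confirmed diagnosis.

```python
def require_encode(tokenizer):
    """Fail early, with a readable message, if the tokenizer lacks encode()."""
    if not callable(getattr(tokenizer, "encode", None)):
        raise RuntimeError(
            f"{type(tokenizer).__name__} has no usable encode() method; "
            "check that the tokenizer path in the config points to a real, "
            "locally available tokenizer."
        )
    # Return the same object so the call can wrap an existing expression.
    return tokenizer
```

Wrapping the tokenizer once, before it is passed to the dataset, turns the late AttributeError into an immediate error that names the likely configuration problem.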