LLaVA models finetuned by xtuner cannot be loaded the way Qwen-VL is. The Qwen-VL developers ship a model file (modeling_qwen.py) in their Hugging Face repo, which is what makes that loading path work, but it also locks Qwen-VL into a single fixed model architecture. By contrast, LLaVA finetuned by xtuner supports different architectures, such as CLIP+Vicuna, CLIP+InternLM, CLIP+InternLM2, DINOv2+InternLM, etc.
LLaVA finetuned by xtuner can be deployed in another way (PR [Refactor & Feature] Refactor xtuner chat to support lmdeploy & vLLM #317, still under development). You can deploy the LLaVA model with the Hugging Face LLaVA chatbot (based on Hugging Face transformers) or the LMDeploy LLaVA chatbot (based on LMDeploy TurboMind). The two chatbots share the same interface, as sketched below.
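A minimal Python sketch of such a shared interface. The class and method names here (HFLlavaChatbot, LMDeployLlavaChatbot, chat) are assumptions for illustration only, not the actual API from PR #317:

```python
# Hypothetical sketch: both backends expose the same chat() method,
# so caller code is backend-agnostic. Names are assumptions, not the
# real xtuner API.
from dataclasses import dataclass


@dataclass
class ChatResponse:
    text: str


class HFLlavaChatbot:
    """Backend based on Hugging Face transformers (assumed name)."""

    def chat(self, image: str, prompt: str) -> ChatResponse:
        # ...encode the image with the visual encoder, then run
        # generate() on the language model via transformers...
        return ChatResponse(text="<reply from transformers backend>")


class LMDeployLlavaChatbot:
    """Backend based on LMDeploy TurboMind (assumed name)."""

    def chat(self, image: str, prompt: str) -> ChatResponse:
        # ...run the same prompt through the TurboMind engine...
        return ChatResponse(text="<reply from TurboMind backend>")


def describe(bot, image: str) -> str:
    # Works with either backend because the interface is identical.
    return bot.chat(image, "Describe this image.").text
```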
I have finished finetuning LLaVA and run the benchmarks, and I now have files like xtuner/llava-internlm2-7b on Hugging Face.
How can I run the model without xtuner (i.e., without using xtuner chat)? For example, Qwen-VL can be loaded with Hugging Face Transformers alone:
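For reference, a minimal sketch of that Qwen-VL loading pattern, following the Qwen-VL-Chat README: trust_remote_code=True pulls in the modeling_qwen.py bundled in the repo, which provides from_list_format and model.chat. The image path is a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True loads the custom modeling_qwen.py shipped
# inside the Qwen/Qwen-VL-Chat repo.
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen-VL-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-VL-Chat", device_map="cuda", trust_remote_code=True).eval()

# Build a multimodal query (image + text); path/to/image.jpg is a
# placeholder for your own image.
query = tokenizer.from_list_format([
    {"image": "path/to/image.jpg"},
    {"text": "What is in this picture?"},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```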