
[Feature] Implement COG-VLM2 #1622

Open

isidentical opened this issue May 20, 2024 · 16 comments

@isidentical
Contributor
Motivation

CogVLM2 is now the SOTA open-source VLM for captioning tasks.


@RunningLeon
Collaborator

@isidentical hi, thanks for the information. We will include CogVLM2 after PR #1502 is merged.

@Jayantverma2

any update?

@RunningLeon
Collaborator

> any update?

Hi, it's in progress. Any updates will be synced to this issue.

@RunningLeon
Collaborator

@isidentical @Jayantverma2 hi guys, CogVLM2 models are supported in PR #1502. If you have time, please give it a try, and feel free to leave comments on the PR. Thanks.
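A minimal way to try it, sketched from the lmdeploy VL pipeline docs (the model path below is a placeholder for wherever your CogVLM2 checkpoint lives):

from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Placeholder path; point this at your local CogVLM2 checkpoint.
pipe = pipeline('/path/to/cogvlm2-llama3-chat-19B')

# Run a single image-text query.
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
response = pipe(('describe this image', image))
print(response)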

@Tushar-ml

@RunningLeon Is this the correct way to initialize CogVLM2?

engine = pipeline(model_path, "cogvlm2", log_level="DEBUG")
I have made some changes to config.json:

{
  "architectures": [
    "CogVLMForCausalLM"
  ],
  "auto_map": {
    "AutoConfig": "configuration_cogvlm.CogVLMConfig",
    "AutoModelForCausalLM": "modeling_cogvlm.CogVLMForCausalLM"
  },
  "vision_config": {
    "dropout_prob": 0.0,
    "hidden_act": "gelu",
    "in_channels": 3,
    "num_hidden_layers": 63,
    "hidden_size": 1792,
    "patch_size": 14,
    "num_heads": 16,
    "intermediate_size": 15360,
    "layer_norm_eps": 1e-06,
    "num_positions": 9217,
    "image_size": 1344
  },
  "hidden_size": 4096,
  "intermediate_size": 14336,
  "num_attention_heads": 32,
  "max_position_embeddings": 8192,
  "rms_norm_eps": 1e-05,
  "template_version": "chat",
  "initializer_range": 0.02,
  "bos_token_id": 128000,
  "eos_token_id": [128001, 128009],
  "pad_token_id": 128002,
  "vocab_size": 128256,
  "num_hidden_layers": 32,
  "hidden_act": "silu",
  "use_cache": true,
  "transformers_version": "4.41.0"
}

But when I run it with this prompt:

prompts = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': prompt},
            {'type': 'image_url', 'image_url': {'url': f'data:image/jpeg;base64,{image}'}}
        ]
    }
]

it generates b''.

@RunningLeon
Collaborator

RunningLeon commented May 29, 2024

@Tushar-ml hi, please follow the examples in the documentation: https://lmdeploy.readthedocs.io/en/latest/inference/vl_pipeline.html#vlm-offline-inference-pipeline.

prompts should look like this:

prompts = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}}
        ]
    }
]
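For a base64-encoded image like in the question above, one way to build the data URL (a sketch; the model and image paths are placeholders):

import base64

from lmdeploy import pipeline

# Placeholder paths; replace with your own checkpoint and image.
model_path = '/path/to/cogvlm2-llama3-chat-19B'
image_path = '/path/to/image.jpg'

# Encode the local image file as a base64 data URL.
with open(image_path, 'rb') as f:
    b64 = base64.b64encode(f.read()).decode('utf-8')

pipe = pipeline(model_path)
prompts = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {'url': f'data:image/jpeg;base64,{b64}'}}
        ]
    }
]
print(pipe(prompts))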

@Tushar-ml

@RunningLeon are there any docs on how to run CogVLM2? As mentioned in the PR, the tokenizer needs to be applied manually.

@pseudotensor

Awesome, looking forward to it. I really like lmdeploy because it's much more stable than sglang for these vision models.

@RunningLeon
Collaborator

> @RunningLeon are there any docs on how to run CogVLM2? As mentioned in the PR, the tokenizer needs to be applied manually.

@Tushar-ml hi, there's no need to do so for CogVLM2, but you should for CogVLM (v1).

@RunningLeon
Collaborator

> Awesome, looking forward to it. I really like lmdeploy because it's much more stable than sglang for these vision models.

@pseudotensor hi, glad to hear that. If possible, please recommend lmdeploy to other people who are interested in deploying LLMs and VLMs. Thanks.

@pseudotensor

> Awesome, looking forward to it. I really like lmdeploy because it's much more stable than sglang for these vision models.
>
> @pseudotensor hi, glad to hear that. If possible, please recommend lmdeploy to other people who are interested in deploying LLMs and VLMs. Thanks.

Yes, will gladly do that.

@Tushar-ml

@RunningLeon I am getting OOM on an A40 with 48 GB of VRAM. What is the recommended setup for CogVLM2, given the model is no larger than 40 GB?

@RunningLeon
Collaborator

> @RunningLeon I am getting OOM on an A40 with 48 GB of VRAM. What is the recommended setup for CogVLM2, given the model is no larger than 40 GB?

@Tushar-ml hi, could you provide your sample code? Normally, you can reduce cache_max_entry_count to shrink the KV cache and lower max_prefill_token_num in PytorchEngineConfig:

cache_max_entry_count: float = 0.8
eviction_type: str = 'recompute'
prefill_interval: int = 16
block_size: int = 64
num_cpu_blocks: int = 0
num_gpu_blocks: int = 0
adapters: Dict[str, str] = None
max_prefill_token_num: int = 4096
thread_safe: bool = False
enable_prefix_caching: bool = False
download_dir: str = None
revision: str = None

def __post_init__(self):
    """Check input validation."""
    assert self.tp >= 1, 'invalid tp'
    assert self.max_batch_size >= 1, 'invalid max_batch_size'
    assert self.cache_max_entry_count > 0 and self.cache_max_entry_count < 1, 'invalid cache_max_entry_count'  # noqa
    assert self.eviction_type in ('recompute', 'copy'), 'invalid eviction_type'
    assert self.num_cpu_blocks >= 0, 'invalid num_cpu_blocks'
    assert self.max_prefill_token_num >= 0, 'invalid max_prefill_token_num'
    assert self.num_gpu_blocks >= 0, 'invalid num_gpu_blocks'
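Concretely, a sketch of applying both knobs (the values 0.4 and 2048 are illustrative, not tuned recommendations, and the model path is a placeholder):

from lmdeploy import pipeline, PytorchEngineConfig

# Shrink the KV cache's share of free GPU memory and the prefill chunk size
# to leave more headroom for the model weights and the vision tower.
backend_config = PytorchEngineConfig(
    cache_max_entry_count=0.4,   # default 0.8
    max_prefill_token_num=2048,  # default 4096
)

pipe = pipeline('/path/to/cogvlm2-llama3-chat-19B', backend_config=backend_config)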

@Tushar-ml

Thanks @RunningLeon, I will try this.

@GuoXu-booo

@RunningLeon Hi!
Due to server network limitations, I could not compile and install the latest lmdeploy on the server, so I pulled the lmdeploy 0.4.2 image from Docker Hub and ran it. Running CogVLM2 then reported this error:

root@gpu9:~/data/CogVLM2# python cogvlm_demo.py
2024-05-31 01:31:08,920 - lmdeploy - ERROR - TypeError: expected string or bytes-like object
2024-05-31 01:31:08,920 - lmdeploy - ERROR - test failed!
model /root/data/cogvlm2-llama3-chinese-chat-19B/ requires transformers version None but transformers 4.40.2 is installed.

My code:
from lmdeploy import pipeline
from lmdeploy.vl import load_image

model_path = '/root/data/cogvlm2-llama3-chinese-chat-19B/'

pipe = pipeline(model_path)

image = load_image('/root/data/dataset/misumi_data/images/Misumi000006.jpg')
response = pipe(('图中出现的零件是什么?', image))  # "What are the parts shown in the image?"
print(response)

I look forward to your reply. Thank you.

@RunningLeon
Collaborator

@GuoXu-booo hi, CogVLM is supported in the PyTorch engine, so you can simply clone the code from the PR and run pip install -e . to install it. BTW, you'd better use the latest code from PR #1502. The environment check fails in your case because there's no transformers_version in your config.json, which is fixed in the latest code:

git clone --recursive -b support-cogvlm-dev https://github.com/RunningLeon/lmdeploy.git
cd lmdeploy 
pip install -e .
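After installing, a quick sanity check that the branch is the copy actually imported (this assumes lmdeploy's usual top-level __version__ attribute):

# Run in Python after the editable install above.
import lmdeploy
print(lmdeploy.__version__)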
