Code for "Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning"
Updated May 21, 2024
An awesome collection of multi-modal large language model papers and projects, including popular training strategies, e.g., PEFT and LoRA.
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
A curated list of awesome Multimodal studies.
The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.
An open-source implementation of LLaVA-NeXT.
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
Open Platform for Embodied Agents
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
A collection of resources on applications of multi-modal learning in medical imaging.
A Framework of Small-scale Large Multimodal Models
AI-first process automation with large language (LLMs), action (LAMs), multimodal (LMMs), and visual language (VLMs) models.
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills