A codebase dedicated to exploring multimodal learning approaches by integrating images of host galaxies of supernovae and their corresponding light-curves and spectra.
Demo for Binding Text, Images, Graphs, and Audio for Music Representation Learning
Movie detection application.
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Audio, Image, Video, Music and 3D content. 🔥
Code for Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities
A curated list of awesome Multimodal studies.
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
FinRobot: An Open-Source AI Agent Platform for Financial Applications using LLMs 🚀 🚀 🚀
Pure C 3D Hybrid GAN using Cross attention, attention and convolution
Improving Chest X-Ray Report Generation by Leveraging Warm-Starting
Python framework to extract multimodal features for multimodal recommendation in a highly-customizable way.
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
A data science project to predict online pet adoption speed using image, natural language, and tabular data with a multi-modal ML framework.
Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances (ACL 2024)
Deep learning based content moderation from text, audio, video & image input modalities.
A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch
Fine-tuning BLIP for pathological visual question answering.
Multimodal Pretraining for Unsupervised Protein Representation Learning
Corpus of resources for multimodal machine learning with physiological signals
Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost