# multimodal-deep-learning
Here are 349 public repositories matching this topic...
A fully deployable React Native mobile app that classifies incoming messages in messaging apps as important or disturbing, using a multimodal machine learning architecture for text classification, image classification, and YouTube video link classification.
Updated
May 10, 2022
Jupyter Notebook
Analyzing Hateful Memes (Resources: Hateful Memes Challenge)
Updated
Feb 18, 2024
Jupyter Notebook
PyTorch data loaders and abstractions for multi-modal data.
Updated
Dec 27, 2022
Jupyter Notebook
Deep Learning for Music & Audio - Multi modal project
Updated
May 16, 2022
Jupyter Notebook
A multimodal model that uses both text and images to predict the expected emotion of a viewer of the news.
Updated
Aug 12, 2022
Jupyter Notebook
The purpose of this project is to build an NLP model to make reading medical abstracts easier.
Updated
Aug 20, 2023
Jupyter Notebook
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Audio, Image, Video, Music and 3D content. 🔥
Showcases ongoing and completed projects within various research themes.
Leveraging Meteorological data and All-Sky Images to create a multimodal model for better forecasting of Solar Irradiance parameters.
Updated
Dec 24, 2023
Jupyter Notebook
Accepted at The Web Conference 2024.
Updated
Feb 6, 2024
Python
Deep learning utilities for multimodal research
Updated
Jul 28, 2023
Python
Code and Models for Binding Text, Images, Graphs, and Audio for Music Representation Learning
Updated
May 18, 2024
Python
Example of a multimodal (end-to-end) deep learning model with a transformer architecture
Project to transform a natural language description into an image using Generative Adversarial Networks.
Updated
Dec 9, 2017
Python
Final project for CS 7643 : Deep Learning (Fall 2022, Georgia Tech)
Updated
Jan 9, 2023
Jupyter Notebook
A novel multimodal approach for emotion recognition deploying early fusion based on graph-captured embeddings
Updated
Jan 3, 2024
Jupyter Notebook
Learning a common representation space from speech and text for cross-modal retrieval given textual queries and speech files.
Updated
Apr 27, 2023
Python
This code is part of the paper: "A Deep Dive Into Neural Synchrony Evaluation for Audio-visual Translation" published at ACM ICMI 2022.
Updated
Apr 29, 2023
Python
Code for the paper "Exploring the Synergy Between Vision-Language Pretraining and ChatGPT for Artwork Captioning: A Preliminary Study"
Updated
Jan 21, 2024
Jupyter Notebook