Updated Jul 8, 2020 - Python
multimodal
Here are 658 public repositories matching this topic...
🤖 A framework for building AI Agents with LLMs, integrating multimodal generative AI technologies including voice, images, videos, and digital humans 🌈💎✨
Updated Jul 31, 2023
A notebook to learn about ML for astronomy through BTSbot.
Updated Feb 7, 2024 - Jupyter Notebook
Visuo-haptic integration during texture exploration
Updated Jan 12, 2024 - Processing
In this course, you'll select open source models from the Hugging Face Hub to perform NLP, audio, image, and multimodal tasks using the Hugging Face transformers library.
Updated Mar 22, 2024 - Jupyter Notebook
Collaborative generation of unique audiovisual experiences using NFC identity cards
Updated Jan 20, 2021 - TypeScript
All content produced for the curricular unit PF (Projeto FEUP) of the Informatics and Computing Engineering degree at FEUP
Updated Oct 11, 2021
Multitasking multimodal AI material that focuses on human interaction and assistance
Updated Apr 29, 2023 - PureBasic
Utilizing a multimodal architecture to predict the appropriate speaker turn in a dialogue.
Updated Feb 21, 2024 - Python
This repo collects multimodal machine learning papers.
Updated Jul 15, 2020
AMR extension for the spatial domain, with grounded frame of reference tracking
Updated Oct 5, 2023
Accepted at The Web Conference 2024.
Updated Feb 6, 2024 - Python
Multi-angle multimodal lip video data
Updated Apr 18, 2024
NeuralTalk is a Python + NumPy project for learning multimodal recurrent neural networks that describe images with sentences.
Updated Dec 22, 2020 - Python
Dataset from the paper "The Semantic Typology of Visually Grounded Paraphrases"
Updated May 9, 2022
Updated Jun 9, 2023 - HTML
Application template for choosing a hotel and a travel tour
Updated Nov 24, 2023 - Kotlin
This repository contains the implementation of the Personalized Real Estate Agent project from Udacity's Generative AI Nanodegree.
Updated Mar 5, 2024 - Jupyter Notebook
IDEFICS (Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS) is an open-access reproduction of Flamingo, a closed-source visual language model developed by DeepMind. Like GPT-4, the multimodal model accepts arbitrary sequences of image and text inputs and produces text outputs.
Updated Oct 16, 2023 - Python