A one stop repository for generative AI research updates, interview resources, notebooks and much more!
Updated May 7, 2024
LAVIS - A One-stop Library for Language-Vision Intelligence
Creating software for automatic monitoring in online proctoring
My Reading Lists of Deep Learning and Natural Language Processing
Oscar and VinVL
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Code for ALBEF: a new vision-language pre-training method
AI Research Platform for Reinforcement Learning from Real Panoramic Images.
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Multimodal-GPT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
Recent Advances in Vision and Language Pre-Trained Models (VL-PTMs)
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
This repository contains my solutions to the assignments for Stanford's CS231n "Convolutional Neural Networks for Visual Recognition" (Spring 2020).
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)