Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 2023)
Updated Apr 23, 2024 - Python
Given a video, we can automatically answer questions about what is happening in it.
FreeVA: Offline MLLM as Training-Free Video Assistant
Data and PyTorch code for the LifeQA LREC 2020 paper.
Part of my work for my Bachelor's Thesis Project on Counterfactual Reasoning for Videos.
Code for ACL SustaiNLP 2023 paper "Is a Video worth n × n Images? A Highly Efficient Approach to Transformer-based Video Question Answering"
Code for ACL SRW 2023 paper "Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering"
A simple attention deep learning model to answer questions about a given video with the most relevant video intervals as answers.
[TIP 2022] Official code of paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”
LifeQA website code
[ICCV 2021] On the hidden treasure of dialog in video question answering
WildQA website code
[NAACL 2024] Official implementation of paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
Multi-Scale Progressive Attention Network for Video Question Answering
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
A PyTorch implementation of EmpiricalMVM
[CVPR 2022] A large-scale public benchmark dataset for video question-answering, especially about evidence and commonsense reasoning. The code used in our paper "From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering", CVPR2022.
A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.
[ICCV 2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer