VDMA: Video Question Answering with Dynamically Generated Multi-Agents

Noriyuki Kugo; Tatsuya Ishibashi; Kosuke Ono; Yuji Sato

arXiv:2407.03610·cs.CV·July 8, 2024

TL;DR

The paper introduces VDMA, a video question answering method that uses a multi-agent system with dynamically generated expert agents, with the aim of producing more accurate and contextually appropriate answers.

Contribution

A multi-agent framework with dynamically generated expert agents for video question answering, developed for the EgoSchema Challenge 2024 and positioned as a complement to existing response generation systems.

Findings

1. Demonstrated improved accuracy in video question answering tasks.

2. Showed the effectiveness of dynamic agent generation for context understanding.

3. Provided detailed experimental results validating the approach.

Abstract

This technical report describes our approach to the EgoSchema Challenge 2024, which asks participants to identify the most appropriate response to a question about a given video clip. We propose Video Question Answering with Dynamically Generated Multi-Agents (VDMA). VDMA complements existing response generation systems by employing a multi-agent system with dynamically generated expert agents, with the aim of providing the most accurate and contextually appropriate responses. The report details the stages of our approach, the tools employed, and the results of our experiments.
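The abstract does not spell out how the expert agents are generated or how their outputs are combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a multiple-choice setting, a text summary standing in for the video, and majority voting as the aggregation rule. Every name in it (`LLM`, `generate_expert_roles`, `ask_expert`, `answer`, `stub_llm`) is hypothetical, and the stub model exists only to make the example runnable.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical interface: a "model" is any function from a prompt to a text reply.
LLM = Callable[[str], str]


def generate_expert_roles(question: str, llm: LLM, n_experts: int = 3) -> List[str]:
    """Ask the model to propose expert roles suited to this question (assumed step)."""
    reply = llm(f"List {n_experts} expert roles, one per line, for: {question}")
    roles = [line.strip() for line in reply.splitlines() if line.strip()]
    return roles[:n_experts]


def ask_expert(role: str, question: str, options: List[str],
               video_summary: str, llm: LLM) -> int:
    """Have one expert agent pick an option index; fall back to 0 if unparseable."""
    numbered = "\n".join(f"{i}: {opt}" for i, opt in enumerate(options))
    prompt = (
        f"You are a {role}.\nVideo: {video_summary}\n"
        f"Question: {question}\nOptions:\n{numbered}\n"
        "Reply with the index of the best option."
    )
    reply = llm(prompt)
    for ch in reply:
        if ch.isdigit() and int(ch) < len(options):
            return int(ch)
    return 0


def answer(question: str, options: List[str], video_summary: str,
           llm: LLM, n_experts: int = 3) -> int:
    """Generate experts for this question, collect their votes, return the majority."""
    roles = generate_expert_roles(question, llm, n_experts)
    votes = [ask_expert(r, question, options, video_summary, llm) for r in roles]
    if not votes:
        return 0  # no experts were generated; assumed fallback
    return Counter(votes).most_common(1)[0][0]


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, used only for this demonstration."""
    if prompt.startswith("List"):
        return "cooking expert\ntool-use expert\nactivity analyst"
    return "1"


choice = answer(
    question="What is the person mainly doing?",
    options=["Repairing a bike", "Preparing a meal", "Painting a wall"],
    video_summary="A person chops vegetables and stirs a pot.",
    llm=stub_llm,
)
print(choice)
```

In a real system the stub would be replaced by calls to an actual language or vision-language model, and the aggregation step could be another agent rather than a simple vote; the report itself should be consulted for what VDMA actually does.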

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition