AMEGO: Active Memory from long EGOcentric videos
Gabriele Goletto, Tushar Nagarajan, Giuseppe Averta, Dima Damen

TL;DR
AMEGO is a novel method for understanding long egocentric videos by creating self-contained, semantic-free representations that enable efficient querying and reasoning, supported by a new challenging benchmark.
Contribution
The paper introduces AMEGO, a new approach for egocentric video comprehension, and the Active Memories Benchmark (AMB) for evaluating detailed video reasoning.
Findings
AMEGO outperforms existing video QA baselines on AMB.
The approach effectively captures key locations and object interactions in long videos.
The benchmark includes over 20,000 challenging visual queries.
Abstract
Egocentric videos provide a unique perspective into individuals' daily experiences, yet their unstructured nature presents challenges for perception. In this paper, we introduce AMEGO, a novel approach aimed at enhancing the comprehension of very-long egocentric videos. Inspired by the human's ability to maintain information from a single watching, AMEGO focuses on constructing a self-contained representation from one egocentric video, capturing key locations and object interactions. This representation is semantic-free and facilitates multiple queries without the need to reprocess the entire visual content. Additionally, to evaluate our understanding of very-long egocentric videos, we introduce the new Active Memories Benchmark (AMB), composed of more than 20K highly challenging visual queries from EPIC-KITCHENS. These queries cover different levels of video reasoning (sequencing,…
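To make the idea of a queryable, semantic-free representation concrete, here is a minimal illustrative sketch in Python. It is not the authors' implementation: the class names (`Segment`, `ActiveMemory`), the field layout, and the three query methods are all hypothetical, chosen only to show how a memory built once from a video could answer several questions without touching the frames again. "Semantic-free" is modeled here by identifying objects and locations with integer IDs rather than class names.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """A time interval in seconds tagged with an integer ID (no class label)."""
    instance_id: int
    start: float
    end: float

    def overlaps(self, t0: float, t1: float) -> bool:
        # Half-open interval overlap test.
        return self.start < t1 and t0 < self.end


@dataclass
class ActiveMemory:
    """Hypothetical container: built once per video, then queried many times."""
    interactions: List[Segment] = field(default_factory=list)  # object interaction tracks
    locations: List[Segment] = field(default_factory=list)     # visits to key locations

    def objects_between(self, t0: float, t1: float) -> List[int]:
        """IDs of objects interacted with at any point in [t0, t1)."""
        return sorted({s.instance_id for s in self.interactions if s.overlaps(t0, t1)})

    def location_at(self, t: float) -> Optional[int]:
        """ID of the location occupied at time t, or None if unknown."""
        for s in self.locations:
            if s.start <= t < s.end:
                return s.instance_id
        return None

    def objects_used_at_location(self, location_id: int) -> List[int]:
        """IDs of objects whose interactions overlap a visit to the given location."""
        visits = [v for v in self.locations if v.instance_id == location_id]
        return sorted({
            s.instance_id
            for s in self.interactions
            if any(s.overlaps(v.start, v.end) for v in visits)
        })


# Toy memory with made-up timestamps (not data from the paper).
memory = ActiveMemory(
    interactions=[Segment(0, 2.0, 6.0), Segment(1, 5.0, 9.0), Segment(0, 20.0, 24.0)],
    locations=[Segment(0, 0.0, 10.0), Segment(1, 10.0, 30.0)],
)

print(memory.objects_between(4.0, 5.5))
print(memory.location_at(21.0))
print(memory.objects_used_at_location(1))
```

Each query reads only the stored intervals, which is the property the abstract highlights: multiple questions can be answered from one pass over the video. How AMEGO actually detects interactions and locations is not covered by this sketch.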
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Artificial Intelligence in Games · Video Analysis and Summarization
