ViLCo-Bench: VIdeo Language COntinual learning Benchmark

Tianqi Tang; Shohreh Deldari; Hao Xue; Celso De Melo; Flora D. Salim

arXiv:2406.13123 · cs.AI · December 17, 2024

TL;DR

This paper introduces ViLCo-Bench, a comprehensive benchmark for video-language continual learning, along with a memory-efficient framework that tackles challenges like long videos, complex language, and text-video misalignment.

Contribution

The study presents the first dedicated video-language continual learning benchmark and a novel framework that improves memory efficiency and handles complex video-text tasks.

Findings

01. ViLCo-Bench offers a more complex and realistic evaluation environment.

02. The proposed framework effectively manages long videos and open-ended language queries.

03. Experimental results demonstrate improved performance over existing methods.

Abstract

Video-language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. The area is relatively under-explored, and establishing appropriate datasets is crucial for facilitating communication and research in it. In this study, we present the first dedicated benchmark, ViLCo-Bench, designed to evaluate continual learning models across a range of video-text tasks. The dataset comprises ten-minute-long videos and corresponding language queries collected from publicly available datasets. Additionally, we introduce a novel memory-efficient framework that incorporates self-supervised learning and mimics long-term and short-term memory effects. This framework addresses challenges including memory complexity from long video clips, natural language complexity from…

Code & Models

Repositories

cruiseresearchgroup/vilco
PyTorch · Official

Videos

ViLCo-Bench: VIdeo Language COntinual learning Benchmark · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems