ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
James Seale Smith, Paola Cascante-Bonilla, Assaf Arbelle, Donghyun Kim, Rameswar Panda, David Cox, Diyi Yang, Zsolt Kira, Rogerio Feris, Leonid Karlinsky

TL;DR
ConStruct-VL introduces a new data-free continual learning benchmark for structured vision-and-language concepts, proposing a novel adversarial pseudo-replay method and a parameter-efficient neural architecture to improve model robustness and performance.
Contribution
The paper presents the first benchmark for data-free continual structured VL concept learning and proposes a novel adversarial pseudo-replay method with a parameter-efficient architecture.
Findings
The proposed method outperforms existing data-free strategies by up to 7%.
It matches some experience-replay methods while preserving data privacy.
The benchmark reveals significant challenges for current data-free continual learning approaches.
Abstract
Recently, large-scale pre-trained Vision-and-Language (VL) foundation models have demonstrated remarkable capabilities in many zero-shot downstream tasks, achieving competitive results for recognizing objects defined by as little as short text prompts. However, it has also been shown that VL models are still brittle in Structured VL Concept (SVLC) reasoning, such as the ability to recognize object attributes, states, and inter-object relations. This leads to reasoning mistakes, which need to be corrected as they occur by teaching VL models the missing SVLC skills; often this must be done using private data where the issue was found, which naturally leads to a data-free continual (no task-id) VL learning setting. In this work, we introduce the first Continual Data-Free Structured VL Concepts Learning (ConStruct-VL) benchmark and show it is challenging for many existing data-free CL…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications
