Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection

Adyasha Maharana; Jaehong Yoon; Tianlong Chen; Mohit Bansal

arXiv:2410.10636 · cs.LG · March 25, 2025

TL;DR

Adapt-∞ introduces a scalable data selection method for continual multimodal instruction tuning, enabling models to efficiently learn new skills while retaining previous knowledge by dynamically selecting and pruning training samples.

Contribution

The paper proposes Adapt-∞, a novel adaptive data selection approach that dynamically balances efficiency and effectiveness in lifelong multimodal instruction tuning.

Findings

01 Reduces dataset size while maintaining performance

02 Alleviates catastrophic forgetting in multimodal models

03 Promotes forward transfer across tasks

Abstract

Visual instruction datasets from various distributors are released at different times and often contain a significant number of semantically redundant text-image pairs, depending on their task compositions (i.e., skills) or reference sources. This redundancy greatly limits the efficient deployment of continually adaptable multimodal large language models, hindering their ability to refine existing skills and acquire new competencies over time. We reframe the problem of lifelong Instruction Tuning (LiIT) via data selection, where the model automatically selects beneficial samples to learn from earlier and new datasets based on the current state of acquired knowledge in the model. We propose Adapt-$\infty$, a new multi-way and adaptive data selection approach that dynamically balances sample efficiency and effectiveness during LiIT. We first construct pseudo-skill clusters by grouping…
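The general pattern the abstract points to (group samples into clusters, then keep a budgeted subset from each group) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated before it says what the samples are grouped by, so the `features` array, the per-sample `scores`, the plain k-means grouping, the even per-cluster budget, and the function names `kmeans` and `select_subset` are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def kmeans(features, n_clusters, n_iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed grouping step); returns a label per row."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct, randomly chosen samples (fancy indexing copies).
    centroids = features[rng.choice(len(features), n_clusters, replace=False)]

    def assign(cents):
        # Squared Euclidean distance from every sample to every centroid.
        dists = ((features[:, None, :] - cents[None, :, :]) ** 2).sum(axis=-1)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(centroids)
        for k in range(n_clusters):
            members = features[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    return assign(centroids)


def select_subset(features, scores, n_clusters, budget, seed=0):
    """Keep the highest-scoring samples of each cluster under an even budget split."""
    labels = kmeans(features, n_clusters, seed=seed)
    per_cluster = budget // n_clusters  # hypothetical: equal share per cluster
    selected = []
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        # Sort this cluster's members by descending score and keep the top ones.
        top = idx[np.argsort(scores[idx])[::-1][:per_cluster]]
        selected.extend(top.tolist())
    return sorted(selected), labels


# Toy data: two well-separated groups of 50 samples each, with random scores.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
scores = rng.random(100)
selected, labels = select_subset(features, scores, n_clusters=2, budget=10)
```

Here `selected` holds the indices of the retained samples. Splitting the budget per cluster rather than taking a global top-k is one simple way to keep every group represented in the pruned set; how the paper itself allocates the budget and scores samples is not stated in the visible part of the abstract.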

Code & Models

Repositories

adymaharana/adapt-inf
PyTorch · Official

Taxonomy

Topics: Speech and dialogue systems · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Pruning