Balance-aware Sequence Sampling Makes Multi-modal Learning Better
Zhi-Hao Guan

TL;DR
This paper introduces Balance-aware Sequence Sampling (BSS), which improves multi-modal learning by ordering training samples according to how balanced they are across modalities. It targets the learning bias that an inappropriate training order induces and aims to make multi-modal models more robust.
Contribution
The paper proposes a sequence sampling method for multi-modal learning that scores the balance degree of each sample, schedules training subsets from balanced to imbalanced samples in a curriculum-learning fashion, and adapts dynamically during training.
Findings
BSS outperforms state-of-the-art MML methods on multiple datasets.
Dynamic sequence sampling reduces modality imbalance effectively.
The approach enhances model robustness and learning efficiency.
Abstract
To address the modality imbalance caused by data heterogeneity, existing multi-modal learning (MML) approaches primarily focus on balancing this difference from the perspective of optimization objectives. However, almost all existing methods ignore the impact of sample sequences, i.e., an inappropriate training order tends to trigger learning bias in the model, further exacerbating modality imbalance. In this paper, we propose Balance-aware Sequence Sampling (BSS) to enhance the robustness of MML. Specifically, we first define a multi-perspective measurer to evaluate the balance degree of each sample. Via the evaluation, we employ a heuristic scheduler based on curriculum learning (CL) that incrementally provides training subsets, progressing from balanced to imbalanced samples to rebalance MML. Moreover, considering that sample balance may evolve as the model capability increases, we…
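The balanced-to-imbalanced schedule described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the multi-perspective measurer or the scheduler's pacing rule, so `balance_score` (agreement between two per-modality confidences), the linear `pacing_fraction`, and the `start_frac` parameter are all hypothetical stand-ins chosen for illustration.

```python
import math
from typing import List, Sequence


def balance_score(conf_a: float, conf_b: float) -> float:
    """Hypothetical balance measure (an assumption, not the paper's measurer):
    1.0 when both modalities are equally confident, lower as they diverge."""
    return 1.0 - abs(conf_a - conf_b)


def pacing_fraction(epoch: int, total_epochs: int, start_frac: float = 0.3) -> float:
    """Assumed linear pacing: fraction of the dataset exposed at `epoch`."""
    if total_epochs <= 1:
        return 1.0
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start_frac + (1.0 - start_frac) * progress


def curriculum_subset(scores: Sequence[float], epoch: int, total_epochs: int,
                      start_frac: float = 0.3) -> List[int]:
    """Indices of the samples available at `epoch`, most balanced first."""
    # Sort sample indices from highest (balanced) to lowest (imbalanced) score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Round the subset size up, then clamp it to the dataset size.
    size = math.ceil(pacing_fraction(epoch, total_epochs, start_frac) * len(scores))
    return order[:min(len(scores), max(1, size))]


# Toy per-modality confidences (made-up values, for illustration only).
confidences = [(0.9, 0.8), (0.95, 0.2), (0.6, 0.55), (0.8, 0.4), (0.5, 0.5)]
scores = [balance_score(a, b) for a, b in confidences]

print(curriculum_subset(scores, epoch=0, total_epochs=5))  # [4, 2]
print(curriculum_subset(scores, epoch=4, total_epochs=5))  # [4, 2, 0, 3, 1]
```

Early epochs see only the most balanced samples, and the subset grows until the full dataset is exposed. To mimic the idea that sample balance may evolve as the model improves, the scores could be recomputed between epochs before calling `curriculum_subset` again; that step is likewise an assumption here.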
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Video Analysis and Summarization
