Bagging-Based Model Merging for Robust General Text Embeddings
Hengran Zhang, Keping Bi, Jiafeng Guo, Jiaming Zhang, Wenbo Yang, Daiting Shi, Xueqi Cheng

TL;DR
This paper systematically compares multi-task training strategies for text embeddings, finds batch-level shuffling most effective, and introduces BOOM, a model merging approach that enhances robustness and incremental learning efficiency.
Contribution
It introduces BOOM, a novel model merging method that improves robustness and supports efficient incremental updates for text embeddings.
Findings
Batch-level shuffling yields the best overall performance.
BOOM improves in-domain and out-of-domain performance.
BOOM reduces training costs in incremental learning.
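As a rough illustration of what bagging combined with model merging can look like, the sketch below draws bootstrap samples and uniformly averages the parameters of models trained on them. This summary does not specify BOOM's actual sampling or merging procedure, so this is a generic sketch and not the authors' method. The `Params` format and the names `bootstrap_sample` and `average_parameters` are assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence

# Hypothetical flat parameter format: parameter name -> list of floats.
Params = Dict[str, List[float]]


def bootstrap_sample(data: Sequence, rng: random.Random) -> list:
    """Draw len(data) items with replacement (the sampling step of bagging)."""
    return [data[rng.randrange(len(data))] for _ in range(len(data))]


def average_parameters(models: Sequence[Params]) -> Params:
    """Uniformly average parameters that share the same names and lengths.

    Assumption: uniform weights. A real merging method may weight models
    differently or merge at another granularity.
    """
    if not models:
        raise ValueError("need at least one model")
    merged: Params = {}
    for name in models[0]:
        # Group the i-th value of this parameter across all models.
        columns = zip(*(m[name] for m in models))
        merged[name] = [sum(col) / len(models) for col in columns]
    return merged
```

In use, each bootstrap sample would feed a separate training run (not shown here), and the resulting parameter sets would be passed to `average_parameters` to produce a single merged model.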
Abstract
General-purpose text embedding models underpin a wide range of NLP and information retrieval applications, and are typically trained on large-scale multi-task corpora to encourage broad generalization. However, it remains unclear how different multi-task training strategies compare in practice, and how to efficiently adapt embedding models as new domains and data types continually emerge. In this work, we present a systematic study of multi-task training for text embeddings from two perspectives: data scheduling and model merging. We compare batch-level shuffling, sequential training variants, two-stage training, and multiple merging granularities, and find that simple batch-level shuffling consistently yields the strongest overall performance, suggesting that task conflicts are limited and training datasets are largely complementary. Despite its effectiveness, batch-level shuffling…
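The data-scheduling strategy named in the abstract can be sketched as follows. The abstract does not define batch-level shuffling precisely, so this is a minimal sketch under one common reading: each batch is drawn from a single task, and the batches from all tasks are then pooled and shuffled. The function names, the `tasks` dictionary layout, and the `seed` parameter are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def make_batches(examples: Sequence, batch_size: int) -> List[list]:
    """Split one task's examples into consecutive batches."""
    return [list(examples[i:i + batch_size])
            for i in range(0, len(examples), batch_size)]


def batch_level_shuffle(tasks: Dict[str, Sequence], batch_size: int,
                        seed: int = 0) -> List[Tuple[str, list]]:
    """Build a training schedule of (task_name, batch) pairs.

    Assumption: every batch contains examples from exactly one task,
    and the order of batches is shuffled across all tasks.
    """
    rng = random.Random(seed)
    schedule: List[Tuple[str, list]] = []
    for task_name, examples in tasks.items():
        shuffled = list(examples)
        rng.shuffle(shuffled)  # shuffle examples within the task
        schedule.extend((task_name, batch)
                        for batch in make_batches(shuffled, batch_size))
    rng.shuffle(schedule)  # interleave batches from different tasks
    return schedule
```

By contrast, a sequential schedule under the same assumptions would skip the final shuffle and process all batches of one task before moving on to the next.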
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Domain Adaptation and Few-Shot Learning
