Bagging-Based Model Merging for Robust General Text Embeddings

Hengran Zhang; Keping Bi; Jiafeng Guo; Jiaming Zhang; Wenbo Yang; Daiting Shi; Xueqi Cheng

arXiv:2602.05787 · cs.IR · February 10, 2026

TL;DR

This paper systematically compares multi-task training strategies for text embeddings, finds batch-level shuffling most effective, and introduces BOOM, a model merging approach that enhances robustness and incremental learning efficiency.

Contribution

The paper introduces BOOM, a bagging-based model merging method that improves the robustness of text embedding models and supports efficient incremental updates.

Findings

01. Batch-level shuffling yields the best overall performance.

02. BOOM improves in-domain and out-of-domain performance.

03. BOOM reduces training costs in incremental learning.
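
The findings above rely on two generic building blocks, bootstrap resampling (the "bagging" in the title) and parameter-level model merging. This page does not describe how BOOM combines them, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes uniform averaging of same-shaped weights, and `toy_train`, `bootstrap_sample`, and `average_weights` are hypothetical names introduced for illustration.

```python
import random

def bootstrap_sample(examples, rng):
    """Draw len(examples) items with replacement (a standard bootstrap sample)."""
    return [rng.choice(examples) for _ in examples]

def average_weights(state_dicts):
    """Uniformly average parameters that share the same names and lengths.

    Assumption: uniform averaging is used here only as the simplest
    merging rule; the paper's actual merging scheme may differ.
    """
    merged = {}
    for name in state_dicts[0]:
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(col) / len(state_dicts) for col in columns]
    return merged

def toy_train(examples):
    """Hypothetical stand-in for fine-tuning: the 'weights' are the data mean."""
    return {"w": [sum(examples) / len(examples)]}

rng = random.Random(0)
data = [1.0, 2.0, 3.0, 4.0]

# Train one toy model per bootstrap sample, then merge them in weight space.
models = [toy_train(bootstrap_sample(data, rng)) for _ in range(3)]
merged = average_weights(models)
```

In a real setting each `toy_train` call would be a full fine-tuning run, and the merged weights would be loaded back into a single embedding model, so inference cost stays that of one model.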

Abstract

General-purpose text embedding models underpin a wide range of NLP and information retrieval applications, and are typically trained on large-scale multi-task corpora to encourage broad generalization. However, it remains unclear how different multi-task training strategies compare in practice, and how to efficiently adapt embedding models as new domains and data types continually emerge. In this work, we present a systematic study of multi-task training for text embeddings from two perspectives: data scheduling and model merging. We compare batch-level shuffling, sequential training variants, two-stage training, and multiple merging granularities, and find that simple batch-level shuffling consistently yields the strongest overall performance, suggesting that task conflicts are limited and training datasets are largely complementary. Despite its effectiveness, batch-level shuffling…
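
To make the data-scheduling terminology concrete, here is a minimal sketch of batch-level shuffling. It assumes the term means that every batch is drawn from a single task and that the order of batches is then shuffled across tasks; the excerpt above does not spell out the definition, so treat this as one plausible reading. The function name `batch_level_shuffle` and the task names are hypothetical.

```python
import random

def batch_level_shuffle(datasets, batch_size, seed=0):
    """Build single-task batches, then shuffle the batch order across tasks.

    datasets: dict mapping a task name to a list of training examples.
    Returns a list of (task_name, batch) pairs in training order.
    """
    rng = random.Random(seed)
    batches = []
    for task, examples in datasets.items():
        examples = list(examples)   # copy so the caller's data is untouched
        rng.shuffle(examples)       # shuffle examples within the task
        for start in range(0, len(examples), batch_size):
            batches.append((task, examples[start:start + batch_size]))
    rng.shuffle(batches)            # interleave tasks at batch granularity
    return batches

# Hypothetical toy corpora for two tasks.
datasets = {
    "retrieval": ["r0", "r1", "r2", "r3", "r4"],
    "sts": ["s0", "s1", "s2"],
}
schedule = batch_level_shuffle(datasets, batch_size=2)
```

Under the same assumption, sequential training would correspond to skipping the final shuffle and concatenating the per-task batch lists in a fixed task order.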

Taxonomy

Topics: Advanced Graph Neural Networks · Topic Modeling · Domain Adaptation and Few-Shot Learning