Automatic In-Domain Exemplar Construction and LLM-Based Refinement of Multi-LLM Expansions for Query Expansion

Minghan Li; Ercong Nie; Siqi Zhao; Tongna Chen; Huiping Huang; Guodong Zhou

arXiv:2602.08917 · cs.IR · March 16, 2026

Open Access

TL;DR

This paper introduces an automated, domain-adaptive query expansion (QE) framework that builds in-domain exemplar pools and refines the outputs of a two-LLM ensemble, reporting statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines on TREC DL20, DBPedia, and SciFact.

Contribution

The paper presents a scalable QE approach that automatically constructs in-domain exemplars from pseudo-relevant passages and combines two heterogeneous LLMs with a refinement LLM that consolidates their expansions into one.

Findings

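01. Consistent, statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines on TREC DL20, DBPedia, and SciFact.

02. Effective domain adaptation without supervision, using exemplar pools harvested from pseudo-relevant passages.

03. Reproducible framework for exemplar selection and multi-LLM generation.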

Abstract

Query expansion with large language models is promising but often relies on hand-crafted prompts, manually chosen exemplars, or a single LLM, making it non-scalable and sensitive to domain shift. We present an automated, domain-adaptive QE framework that builds in-domain exemplar pools by harvesting pseudo-relevant passages using a BM25-MonoT5 pipeline. A training-free cluster-based strategy selects diverse demonstrations, yielding strong and stable in-context QE without supervision. To further exploit model complementarity, we introduce a two-LLM ensemble in which two heterogeneous LLMs independently generate expansions and a refinement LLM consolidates them into one coherent expansion. Across TREC DL20, DBPedia, and SciFact, the refined ensemble delivers consistent and statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines. The framework offers a…
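The two training-free steps the abstract describes, cluster-based selection of diverse demonstrations and consolidation of two independent expansions by a refinement LLM, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: plain k-means over precomputed exemplar embeddings stands in for the clustering strategy, exemplars are assumed to be (query, pseudo-relevant passage) pairs, the prompt wording is invented, and the LLMs are hypothetical callables that map a prompt string to text. The BM25-MonoT5 harvesting stage is not shown.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def _dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_diverse_exemplars(embeddings: List[Vector], k: int,
                             iters: int = 20, seed: int = 0) -> List[int]:
    """Run plain k-means, then return the pool index nearest to each centroid."""
    if not 1 <= k <= len(embeddings):
        raise ValueError("k must be between 1 and the pool size")
    rng = random.Random(seed)
    centroids = [list(embeddings[i])
                 for i in rng.sample(range(len(embeddings)), k)]
    for _ in range(iters):
        # Assign every pool item to its closest centroid.
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for v in embeddings:
            nearest = min(range(k), key=lambda j: _dist2(v, centroids[j]))
            clusters[nearest].append(v)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        updated = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    # One distinct representative per centroid gives k diverse demonstrations.
    chosen: List[int] = []
    for c in centroids:
        candidates = [i for i in range(len(embeddings)) if i not in chosen]
        chosen.append(min(candidates, key=lambda i: _dist2(embeddings[i], c)))
    return sorted(chosen)


def build_fewshot_prompt(exemplars: List[Tuple[str, str]], query: str) -> str:
    """Format (query, passage) demonstrations followed by the new query."""
    shots = "\n\n".join(f"Query: {q}\nPassage: {p}" for q, p in exemplars)
    return f"{shots}\n\nQuery: {query}\nPassage:"


def ensemble_expand(query: str, exemplars: List[Tuple[str, str]],
                    generators: List[LLM], refiner: LLM) -> str:
    """Collect one draft per generator LLM, then ask the refiner to merge them."""
    prompt = build_fewshot_prompt(exemplars, query)
    drafts = [generate(prompt) for generate in generators]
    listing = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(drafts))
    refine_prompt = (
        f"Query: {query}\nCandidate expansions:\n{listing}\n"
        "Merge these into one coherent expansion:"
    )
    return refiner(refine_prompt)
```

In this sketch the generators never see each other's drafts, which mirrors the independent generation the abstract describes; only the refiner receives both. Swapping in real models would mean replacing the callables, and the choice of embedding model and of k is left open here because the abstract does not state it.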

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks