Query Expansion Using Term Distribution and Term Association
Dipasree Pal, Mandar Mitra, Kalyankumar Datta

TL;DR
This paper explores combining distribution-based and association-based query expansion methods to enhance retrieval effectiveness, proposing a new combined approach and modifications to existing methods tested across multiple TREC collections.
Contribution
The paper introduces a novel combination of distribution-based and association-based query expansion techniques, along with modifications to the LCA and Bo1 methods that improve retrieval performance.
Findings
Combined methods outperform individual approaches.
Proposed modifications lead to better retrieval results.
The methods are effective across 11 TREC collections.
Abstract
Good term selection is an important issue for an automatic query expansion (AQE) technique. AQE techniques that select expansion terms from the target corpus usually do so in one of two ways. Distribution-based term selection compares the distribution of a term in the (pseudo) relevant documents with its distribution in the whole corpus or with a random distribution. Two well-known distribution-based methods are based on Kullback-Leibler Divergence (KLD) and Bose-Einstein statistics (Bo1). Association-based term selection, on the other hand, uses information about how a candidate term co-occurs with the original query terms. Local Context Analysis (LCA) and Relevance-based Language Model (RM3) are examples of association-based methods. Our goal in this study is to investigate how these two classes of methods may be combined to improve retrieval effectiveness. We propose the following combination-based…
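To make the distribution-based idea concrete, here is a minimal sketch of how KLD and Bo1 term scores are commonly computed from pseudo-relevant documents. It follows the standard textbook formulations, not the modified variants the paper proposes, and every name in it (`kld_scores`, `bo1_scores`, `collection_tf`, `collection_len`, `num_docs`) is hypothetical. The base-2 logarithm and the maximum-likelihood estimates without smoothing are assumptions made for illustration.

```python
import math
from collections import Counter


def _feedback_tf(feedback_docs):
    """Count term occurrences across all pseudo-relevant (feedback) documents."""
    return Counter(term for doc in feedback_docs for term in doc)


def kld_scores(feedback_docs, collection_tf, collection_len):
    """KLD-style score per term: p_R(t) * log2(p_R(t) / p_C(t)).

    p_R is the term's relative frequency in the feedback documents and
    p_C its relative frequency in the whole collection (assumed unsmoothed).
    """
    fb_tf = _feedback_tf(feedback_docs)
    fb_len = sum(fb_tf.values())
    scores = {}
    for term, tf in fb_tf.items():
        p_r = tf / fb_len
        p_c = collection_tf[term] / collection_len
        scores[term] = p_r * math.log2(p_r / p_c)
    return scores


def bo1_scores(feedback_docs, collection_tf, num_docs):
    """Bo1-style score per term: tf * log2((1 + P_n) / P_n) + log2(1 + P_n).

    tf is the term's frequency in the feedback documents and
    P_n = (collection frequency of the term) / (number of documents).
    """
    fb_tf = _feedback_tf(feedback_docs)
    scores = {}
    for term, tf in fb_tf.items():
        p_n = collection_tf[term] / num_docs
        scores[term] = tf * math.log2((1 + p_n) / p_n) + math.log2(1 + p_n)
    return scores
```

Both functions take tokenized feedback documents (lists of terms) plus collection statistics and return a term-to-score mapping; the highest-scoring terms would be the expansion candidates. A term that is frequent in the feedback set but rare in the collection scores high under both, while a term that is common everywhere scores low.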
