Topic Level Disambiguation for Weak Queries

Hui Zhang; Kiduk Yang; Elin Jacob

arXiv:1502.04823 · cs.IR · February 18, 2015

TL;DR

This paper introduces a novel topic detection method leveraging Wikipedia's structure and language models to improve IR performance on weak, ambiguous queries, showing promising results over traditional disambiguation techniques.

Contribution

The study presents a new topic detection approach that combines language models and Wikipedia knowledge to enhance query disambiguation and retrieval accuracy for weak queries.

Findings

01. Topic detection improves IR results on weak queries.

02. Query disambiguation alone does not significantly enhance IR performance.

03. Wikipedia-based topic modeling outperforms traditional methods.

Abstract

Despite limited success, information retrieval (IR) systems today are not intelligent or reliable. IR systems return poor search results when users formulate their information needs into incomplete or ambiguous queries (i.e., weak queries). Therefore, one of the main challenges in modern IR research is to provide consistent results across all queries by improving the performance on weak queries. However, existing IR approaches such as query expansion are not very effective because they make little effort to analyze and exploit the meanings of the queries. Furthermore, word sense disambiguation approaches, which rely on textual context, are ineffective against weak queries, which are typically short. Motivated by the demand for a robust IR system that can consistently provide highly accurate results, the proposed study implemented a novel topic detection method that leveraged both the language…
