Towards Proactive Information Retrieval in Noisy Text with Wikipedia Concepts
Tabish Ahmed, Sahan Bulathwela

TL;DR
This paper investigates how leveraging Wikipedia concepts and entity linking can enhance proactive information retrieval from noisy text, improving relevance detection and query disambiguation.
Contribution
It introduces two models that incorporate Wikipedia concepts into relevance ranking, demonstrating improved retrieval precision and query understanding in noisy text scenarios.
Findings
Wikipedia concepts provide a clear relevance signal.
Entity linking improves ranking precision.
Wikifying background context aids query disambiguation.
Abstract
Extracting useful information from a user's history to understand their informational needs is a crucial feature of a proactive information retrieval system. For understanding information and relevance, Wikipedia can provide the background knowledge that an intelligent system needs. This work explores how exploiting the context of a query using Wikipedia concepts can improve proactive information retrieval on noisy text. We formulate two models that use entity linking to associate Wikipedia topics with the relevance model. Our experiments on a podcast segment retrieval task demonstrate that there is a clear signal of relevance in Wikipedia concepts and that a ranking model can improve precision by incorporating them. We also find that Wikifying the background context of a query can help disambiguate the meaning of the query, further helping proactive information retrieval.
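The abstract does not spell out the two models, so the following is only a minimal sketch of the general idea, not the authors' method. It links a query plus its background context to Wikipedia concepts and mixes a concept-overlap signal into a lexical ranking score. Everything named here is an assumption made for illustration: the toy gazetteer `TOY_LINKER` stands in for a real entity linker, and `wikify`, `jaccard`, `lexical_score`, `rank`, and the `weight` parameter are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy gazetteer standing in for a real entity linker (assumption).
# Maps a single-token surface form to a Wikipedia concept title.
TOY_LINKER: Dict[str, str] = {
    "jaguar": "Jaguar",
    "rainforest": "Rainforest",
    "engine": "Engine",
}


def wikify(text: str) -> Set[str]:
    """Return the toy Wikipedia concepts whose surface form is a token of text."""
    tokens = set(text.lower().split())
    return {concept for surface, concept in TOY_LINKER.items() if surface in tokens}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two concept sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lexical_score(query: str, segment: str) -> float:
    """Fraction of query tokens that also appear in the segment."""
    q = set(query.lower().split())
    s = set(segment.lower().split())
    return len(q & s) / len(q) if q else 0.0


def rank(
    query: str, background: str, segments: List[str], weight: float = 0.5
) -> List[Tuple[str, float]]:
    """Rank segments by a weighted mix of lexical and concept-overlap scores."""
    # Wikify the query together with its background context.
    context_concepts = wikify(query + " " + background)
    scored = [
        (
            seg,
            (1 - weight) * lexical_score(query, seg)
            + weight * jaccard(context_concepts, wikify(seg)),
        )
        for seg in segments
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    segments = [
        "the jaguar has a powerful engine",
        "the jaguar hunts in the rainforest",
    ]
    for seg, score in rank("jaguar", "a documentary about rainforest wildlife", segments):
        print(f"{score:.2f}  {seg}")
```

In this toy setup the query "jaguar" matches both segments equally on lexical overlap. The background context contributes the Rainforest concept, which moves the wildlife segment to the top. That loosely mirrors the disambiguation effect the abstract describes.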
Taxonomy
Topics: Wikis in Education and Collaboration · Topic Modeling · Natural Language Processing Techniques
