Proactive Query Expansion for Streaming Data Using External Source
Farah Alshanik, Amy Apon, Yuheng Du, Alexander Herzog, Ilya Safro

TL;DR
This paper presents an automated query expansion pipeline for streaming data, leveraging external sources and probabilistic models to improve information retrieval during emergent events.
Contribution
It introduces a novel method combining Dynamic Eigenvector Centrality and LDA for real-time query expansion using external data sources in streaming environments.
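To make the centrality idea concrete, here is a minimal sketch of ranking candidate expansion terms by eigenvector centrality on a term co-occurrence graph. It is not the paper's Dynamic Eigenvector Centrality: it scores a single static batch of documents, omits the LDA step and the external source entirely, and the function names (`cooccurrence_graph`, `eigenvector_centrality`, `expand_query`) and the toy documents are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations
import math


def cooccurrence_graph(docs):
    """Build an undirected weighted graph: terms are nodes, and an edge weight
    counts the documents in which both terms appear together."""
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def eigenvector_centrality(graph, iterations=200, tol=1e-9):
    """Power iteration on (A + I), L2-normalised after every step.
    The identity shift keeps the iteration from oscillating on bipartite
    graphs and does not change the leading eigenvector of A."""
    nodes = list(graph)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: score[n] + sum(w * score[m] for m, w in graph[n].items())
               for n in nodes}
        norm = math.sqrt(sum(v * v for v in new.values()))
        if norm == 0.0:
            return score
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


def expand_query(query_terms, docs, k=3):
    """Append the k most central terms that are not already in the query."""
    scores = eigenvector_centrality(cooccurrence_graph(docs))
    candidates = [t for t in scores if t not in query_terms]
    candidates.sort(key=lambda t: (-scores[t], t))
    return list(query_terms) + candidates[:k]


# Toy, hand-written token lists standing in for one window of a tweet stream.
docs = [
    ["baltimore", "protest", "police"],
    ["baltimore", "protest", "curfew"],
    ["baltimore", "police", "curfew"],
    ["protest", "march"],
]
print(expand_query(["baltimore"], docs, k=2))
```

In a streaming setting one would presumably recompute or incrementally update the scores per time window and watch for terms whose centrality rises sharply; that windowing logic is an assumption here and is left out of the sketch.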
Findings
Enhanced retrieval of relevant tweets during Baltimore protests
Effective detection of emergent events through eigenvector centrality
Improved query relevance using external data sources
Abstract
Query expansion is the process of reformulating an original query by adding relevant words. Choosing which terms to add, whether to improve the performance of query expansion methods or to enhance the quality of the retrieved results, is an important aspect of any information retrieval system. Adding words that are informative, or that positively affect the quality of the search query, plays an important role in retrieving relevant documents on a given topic, which in turn can improve the efficiency of the information retrieval system. Typically, query expansion techniques add or substitute words in a given search query to collect relevant data. In this paper, we design and implement a pipeline for automated query expansion. We outline several tools that use different methods to expand the query. Our methods depend on targeting emergent events in…
Taxonomy
Topics: Web Data Mining and Analysis · Data Management and Algorithms · Caching and Content Delivery
