Learning to Reformulate the Queries on the WEB
Amir H. Jadidinejad

TL;DR
This paper presents an end-to-end neural network model that automatically reformulates web search queries using anchor phrase data, significantly improving retrieval performance.
Contribution
It introduces a novel character-level sequence-to-sequence model trained on large-scale anchor phrase data for query reformulation, advancing beyond traditional phrase-based methods.
Findings
Reformulated queries improve retrieval effectiveness.
The model leverages large-scale anchor phrase data.
Significant performance gains demonstrated on TREC collections.
Abstract
The inability of naive users to formulate appropriate queries is a fundamental problem for web search engines. Assisting users in issuing more effective queries is therefore an important way to improve user satisfaction. One effective approach is query reformulation, which generates new, effective queries from the query the user has issued. Previous research typically generates words and phrases related to the original query. Because the definition of query reformulation is quite general, it is very difficult to develop a uniform term-based approach to the problem. This paper uses readily available data, in particular over one billion anchor phrases in the ClueWeb09 corpus, to learn an end-to-end encoder-decoder model that automatically generates effective queries. Following successful research on sequence-to-sequence models, we employ a character-level…
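To make the character-level setup in the abstract more concrete, here is a minimal sketch of how text pairs might be turned into character id sequences before being fed to an encoder-decoder model. Everything in it is an assumption for illustration: the special symbols, the function names `build_vocab`, `encode`, and `decode`, the padding scheme, and the example pairs are hypothetical and not taken from the paper, which does not describe here how training pairs are built from anchor phrases.

```python
# Hypothetical special symbols; the paper's actual vocabulary is not given here.
PAD, SOS, EOS = "<pad>", "<s>", "</s>"


def build_vocab(texts):
    """Map the special symbols and every character seen in `texts` to an integer id."""
    chars = sorted({ch for text in texts for ch in text})
    return {sym: i for i, sym in enumerate([PAD, SOS, EOS] + chars)}


def encode(text, vocab, max_len):
    """Turn a string into a list of exactly `max_len` character ids.

    The sequence is wrapped in start/end symbols, truncated if too long
    (which may drop the end symbol), and right-padded if too short.
    """
    ids = [vocab[SOS]] + [vocab[ch] for ch in text] + [vocab[EOS]]
    ids = ids[:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


def decode(ids, vocab):
    """Invert `encode`: stop at the end symbol, skip start and padding symbols."""
    inverse = {i: sym for sym, i in vocab.items()}
    chars = []
    for i in ids:
        sym = inverse[i]
        if sym == EOS:
            break
        if sym in (PAD, SOS):
            continue
        chars.append(sym)
    return "".join(chars)


# Made-up (source, target) pairs, purely for illustration.
pairs = [
    ("cheap flights", "low cost airline tickets"),
    ("python docs", "python documentation"),
]

vocab = build_vocab([text for pair in pairs for text in pair])
max_len = max(len(text) for pair in pairs for text in pair) + 2  # room for <s> and </s>

encoder_input = [encode(src, vocab, max_len) for src, _ in pairs]
decoder_target = [encode(tgt, vocab, max_len) for _, tgt in pairs]
```

Working at the character level keeps the vocabulary small and avoids out-of-vocabulary words, at the cost of longer sequences. A real pipeline would replace the toy pairs with data derived from the corpus and pass the id sequences to a trained model.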
Taxonomy
Topics: Web Data Mining and Analysis · Information Retrieval and Search Behavior · Topic Modeling
