An IR-based Evaluation Framework for Web Search Query Segmentation
Rishiraj Saha Roy, Niloy Ganguly, Monojit Choudhury, Srivatsan Laxman

TL;DR
This paper introduces an IR-based evaluation framework for web search query segmentation. It shows that state-of-the-art segmentation algorithms perform as well as, and sometimes better than, human annotations when judged by IR performance, and it gives an objective view of the gap between the present best and the best possible segmentation algorithm.
Contribution
It presents the first IR-based evaluation framework for query segmentation, challenging traditional annotation-based validation and offering a new perspective on segmentation effectiveness.
Findings
State-of-the-art algorithms match or outperform human annotations in IR tasks.
The evaluation framework identifies key segments necessary for optimal retrieval.
The dataset used for evaluation is publicly available for research use.
Abstract
This paper presents the first evaluation framework for Web search query segmentation based directly on IR performance. In the past, segmentation strategies were mainly validated against manual annotations. Our work shows that the goodness of a segmentation algorithm as judged through evaluation against a handful of human-annotated segmentations hardly reflects its effectiveness in an IR-based setup. In fact, state-of-the-art algorithms are shown to perform as well as, and sometimes even better than, human annotations -- a fact masked by previous validations. The proposed framework also provides us with an objective understanding of the gap between the present best and the best possible segmentation algorithm. We draw these conclusions based on an extensive evaluation of six segmentation strategies, including three most recent algorithms, vis-a-vis segmentations from three human annotators.…
Taxonomy
Topics: Web Data Mining and Analysis · Information Retrieval and Search Behavior · Advanced Image and Video Retrieval Techniques
