Understanding the User: An Intent-Based Ranking Dataset

Abhijit Anand; Jurek Leonhardt; V Venktesh; Avishek Anand

arXiv:2408.17103·cs.IR·September 2, 2024

Open Access

TL;DR

This paper augments web search datasets, specifically TREC-DL-21 and TREC-DL-22, with query descriptions that are generated by LLMs and validated through crowdsourcing, making user intent explicit for the evaluation of retrieval systems.

Contribution

It introduces a novel method combining LLMs and crowdsourcing to annotate query intent in benchmark datasets, enriching their utility for evaluation tasks.

Findings

01. Generated descriptions are validated through crowdsourcing.

02. Enhanced datasets provide richer context for ranking evaluation.

03. Method improves understanding of implicit user intent.

Abstract

As information retrieval systems continue to evolve, accurate evaluation and benchmarking of these systems become pivotal. Web search datasets, such as MS MARCO, primarily provide short keyword queries without accompanying intent or descriptions, posing a challenge in comprehending the underlying information need. This paper proposes an approach to augmenting such datasets to annotate informative query descriptions, with a focus on two prominent benchmark datasets: TREC-DL-21 and TREC-DL-22. Our methodology involves utilizing state-of-the-art LLMs to analyze and comprehend the implicit intent within individual queries from benchmark datasets. By extracting key semantic elements, we construct detailed and contextually rich descriptions for these queries. To validate the generated query descriptions, we employ crowdsourcing as a reliable means of obtaining diverse human perspectives on…
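The two stages the abstract outlines (an LLM drafts a description of a short query, then crowd workers judge it) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the prompt wording, the `build_prompt` and `validate` helpers, the `AnnotatedQuery` record, and the strict-majority threshold are all hypothetical, and the LLM call itself is left out.

```python
from dataclasses import dataclass

# Hypothetical prompt wording; the paper's actual prompts are not given here.
PROMPT_TEMPLATE = (
    "Query: {query}\n"
    "Describe in one or two sentences the information need behind this query."
)


def build_prompt(query: str) -> str:
    """Fill the (assumed) template with a short keyword query."""
    return PROMPT_TEMPLATE.format(query=query.strip())


@dataclass
class AnnotatedQuery:
    query: str
    description: str
    agreement: float  # fraction of workers who accepted the description
    accepted: bool


def validate(query: str, description: str, votes: list[bool],
             min_agreement: float = 0.5) -> AnnotatedQuery:
    """Aggregate crowd votes with a strict-majority rule (an assumption)."""
    if not votes:
        raise ValueError("at least one crowd vote is required")
    agreement = sum(votes) / len(votes)
    return AnnotatedQuery(query, description, agreement,
                          agreement > min_agreement)


if __name__ == "__main__":
    # Placeholder query, description, and votes, made up for illustration.
    prompt = build_prompt("example keyword query")
    result = validate("example keyword query",
                      "Placeholder description returned by some LLM.",
                      [True, True, False])
    print(prompt)
    print(result.accepted, round(result.agreement, 2))
```

Keeping the agreement score alongside the accept/reject flag lets a downstream user apply a stricter threshold later without re-collecting votes.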

Taxonomy

Topics: Recommender Systems and Techniques

Methods: Sparse Evolutionary Training · Focus