Evaluating Elements of Web-based Data Enrichment for Pseudo-Relevance Feedback Retrieval

Timo Breuer; Melanie Pest; Philipp Schaer

arXiv:2203.05420·cs.IR·March 11, 2022

TL;DR

This paper evaluates a web-based data enrichment method for pseudo-relevance feedback in information retrieval, analyzing its robustness and effectiveness across different search engines, queries, and test collections.

Contribution

It provides a comprehensive analysis of web content-based data enrichment for relevance feedback, extending prior work with systematic experiments on system performance over time.

Findings

01 The method is robust in terms of average retrieval performance across the search engines, queries, and test collections considered.

02 Web content enrichment improves retrieval performance.

03 Performance varies with search engine and query type.

Abstract

In this work, we analyze a pseudo-relevance retrieval method based on the results of web search engines. By enriching topics with text data from web search engine result pages and linked contents, we train topic-specific and cost-efficient classifiers that can be used to search test collections for relevant documents. Building upon attempts initially made at TREC Common Core 2018 by Grossman and Cormack, we address questions of system performance over time considering different search engines, queries, and test collections. Our experimental results show how and to what extent the considered components affect the retrieval performance. Overall, the analyzed method is robust in terms of average retrieval performance and a promising way to use web content for the data enrichment of relevance feedback methods.
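The pipeline described in the abstract (enrich a topic with web text, train a topic-specific classifier, rank the test collection with it) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the abstract does not name the classifier, the features, or the source of negative examples, so this sketch uses a small logistic regression over normalized term frequencies, treats a handful of collection documents as presumed non-relevant, and replaces scraped web content with hardcoded strings. The names `features`, `train_logreg`, and `rank_collection` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def features(text):
    """L2-normalized term-frequency vector, stored as a sparse dict."""
    counts = Counter(tokenize(text))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {term: c / norm for term, c in counts.items()}


def dot(weights, feats):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(weights.get(t, 0.0) * v for t, v in feats.items())


def train_logreg(examples, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient steps.

    examples: list of (feature dict, label) pairs with label in {0, 1}.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label in examples:
            # Clamp the logit so math.exp cannot overflow.
            z = max(-30.0, min(30.0, bias + dot(weights, feats)))
            error = label - 1.0 / (1.0 + math.exp(-z))
            bias += lr * error
            for t, v in feats.items():
                weights[t] = weights.get(t, 0.0) + lr * error * v
    return weights, bias


def rank_collection(web_texts, collection, negative_ids):
    """Train on web texts (positives) vs. presumed non-relevant documents
    (negatives), then rank every document in the collection by its score."""
    examples = [(features(t), 1) for t in web_texts]
    examples += [(features(collection[d]), 0) for d in negative_ids]
    weights, bias = train_logreg(examples)
    scores = {d: bias + dot(weights, features(text))
              for d, text in collection.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Assumption: stand-ins for text scraped from result pages for one topic.
WEB_TEXTS = [
    "wind turbines generate renewable power",
    "offshore wind farms expand renewable energy capacity",
]
# Assumption: a toy test collection keyed by document id.
COLLECTION = {
    "d1": "offshore wind turbines supply renewable energy",
    "d2": "football club signed a striker",
    "d3": "city council debates parking fees",
    "d4": "stock markets fell after rate decision",
    "d5": "wind energy capacity grows as turbines get cheaper",
}
# Assumption: fixed here for reproducibility; a real run would sample these.
NEGATIVE_IDS = ["d2", "d3", "d4"]

print(rank_collection(WEB_TEXTS, COLLECTION, NEGATIVE_IDS))
```

No relevance judgments are used at training time, which is what makes the approach a form of pseudo-relevance feedback: the web text alone stands in for known relevant documents. In a realistic setting the negatives would be drawn at random from a much larger collection, where a random sample is unlikely to contain many relevant documents.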

Code & Models

Repositories

irgroup/clef2021-web-prf (official)
