TripJudge: A Relevance Judgement Test Collection for TripClick Health Retrieval

Sophia Althammer; Sebastian Hofstätter; Suzan Verberne; Allan Hanbury

arXiv:2208.06936·cs.IR·August 16, 2022

TL;DR

This paper introduces TripJudge, a new relevance judgement test collection for TripClick health retrieval, addressing biases and coverage issues in previous click-based datasets, and demonstrating its impact on system evaluation.

Contribution

The paper presents TripJudge, a novel, human-annotated relevance test collection for TripClick, improving reliability and coverage over previous click-based datasets.

Findings

01 TripJudge improves relevance assessment quality.

02 Evaluation results differ significantly between click-based and judgement-based methods.

03 TripJudge enhances the reliability of health retrieval system evaluation.

Abstract

Robust test collections are crucial for Information Retrieval research. Recently, there has been growing interest in evaluating retrieval systems for domain-specific retrieval tasks; however, these tasks often lack a reliable test collection with human-annotated relevance assessments following the Cranfield paradigm. In the medical domain, the TripClick collection was recently proposed, which contains click log data from the Trip search engine and includes two click-based test sets. However, the clicks are biased towards the retrieval model used, which remains unknown, and a previous study shows that the test sets have low judgement coverage for the Top-10 results of lexical and neural retrieval models. In this paper we present TripJudge, a novel relevance judgement test collection for TripClick health retrieval. We collect relevance judgements in an annotation campaign and ensure the quality…
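
The "judgement coverage for the Top-10 results" mentioned in the abstract can be illustrated with a minimal sketch. It assumes coverage means the fraction of a query's top-k retrieved documents that carry any judgement, averaged over queries; the paper's exact definition may differ. The function name `judgement_coverage_at_k`, the dictionary layout, and the toy data are hypothetical and are not taken from the TripJudge repository.

```python
from typing import Dict, List, Set


def judgement_coverage_at_k(
    run: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Mean fraction of each query's top-k results that have any judgement.

    run:   query id -> ranked list of document ids (best first)
    qrels: query id -> set of document ids that were judged (any label)
    """
    per_query = []
    for query_id, ranked_docs in run.items():
        top_k = ranked_docs[:k]
        if not top_k:
            continue  # skip queries with an empty ranking
        judged = qrels.get(query_id, set())
        # Assumption: normalise by the number of results actually returned,
        # so rankings shorter than k are not penalised.
        per_query.append(sum(doc in judged for doc in top_k) / len(top_k))
    return sum(per_query) / len(per_query) if per_query else 0.0


# Toy data, invented for illustration only.
run = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"d5", "d6"}}
coverage = judgement_coverage_at_k(run, qrels, k=4)
```

Under this definition, low coverage means that many top-ranked documents have no judgement at all, so evaluation measures that treat unjudged documents as non-relevant may understate a system's quality.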

Code & Models

Repositories

sophiaalthammer/tripjudge (Official)

Taxonomy

Methods: Test