From Baseline to Top Performer: A Reproducibility Study of Approaches at the TREC 2021 Conversational Assistance Track
Weronika Lajewska, Krisztian Balog

TL;DR
This study reproduces the organizers' baseline and the top-performing participant submission at the TREC 2021 Conversational Assistance Track, documents the reproducibility challenges encountered, and shows how advanced retrieval techniques and dataset choices affect system performance.
Contribution
The paper provides a detailed reproducibility analysis of the organizers' baseline and the top-performing submission, identifies practical information missing from the accompanying papers, and explores how different pipeline components affect performance.
Findings
Results can be reproduced within a 19% relative margin on the main evaluation measure
The relative difference between the baseline and the top-performing system shrinks from the reported 18% to 5%
Advanced retrieval techniques improve system effectiveness
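
The percentages in these findings are relative differences. As a minimal sketch of that arithmetic, assuming the common definition (value - reference) / reference, the snippet below compares a reproduced score against a reported one. The function name and both scores are hypothetical placeholders for illustration; they are not values from the paper, and the paper may define its margin differently.

```python
def relative_difference(value: float, reference: float) -> float:
    """Return (value - reference) / reference as a fraction.

    Assumed definition for illustration; not confirmed by the source.
    """
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / reference


# Hypothetical scores for illustration only; not taken from the paper.
reported_score = 0.50
reproduced_score = 0.45

# A negative result means the reproduced score falls below the reported one.
margin = relative_difference(reproduced_score, reported_score)
print(f"Relative difference: {margin:+.1%}")
```

The same function applies to the gap between two systems: pass the top system's score as value and the baseline's score as reference.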
Abstract
This paper reports on an effort to reproduce the organizers' baseline as well as the top-performing participant submission at the 2021 edition of the TREC Conversational Assistance track. TREC systems are commonly regarded as reference points for effectiveness comparison. Yet, the papers accompanying them have less strict requirements than peer-reviewed publications, which can make reproducibility challenging. Our results indicate that key practical information is indeed missing. While the results can be reproduced within a 19% relative margin with respect to the main evaluation measure, the relative difference between the baseline and the top-performing approach shrinks from the reported 18% to 5%. Additionally, we report on a new set of experiments aimed at understanding the impact of various pipeline components. We show that end-to-end system performance can indeed benefit from…
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Multimodal Machine Learning Applications
