TL;DR
FiRA is a new dataset with graded relevance annotations at the passage and word levels. It supports joint evaluation of document ranking and question answering models, and shows how relevance is distributed within long documents.
Contribution
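Introduces FiRA, a dataset that extends the ranked retrieval annotations of the TREC 2019 Deep Learning track with passage- and word-level graded relevance annotations for all relevant documents, so that approaches which both rank documents and provide snippets or answers can be evaluated together.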
Findings
The TKL document ranking model achieves state-of-the-art retrieval results on long documents
TKL misses many relevant passages despite strong overall performance
Relevance distribution varies across different positions in long documents
Abstract
There are many existing retrieval and question answering datasets. However, most of them focus either on ranked list evaluation or on single-candidate question answering. This divide makes it challenging to properly evaluate approaches concerned with ranking documents and providing snippets or answers for a given query. In this work, we present FiRA: a novel dataset of Fine-Grained Relevance Annotations. We extend the ranked retrieval annotations of the Deep Learning track of TREC 2019 with passage- and word-level graded relevance annotations for all relevant documents. We use our newly created data to study the distribution of relevance in long documents, as well as the attention of annotators to specific positions of the text. As an example, we evaluate the recently introduced TKL document ranking model. We find that although TKL exhibits state-of-the-art retrieval results for long…
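To make the idea of studying relevance by position concrete, here is a minimal sketch of how passage-level graded annotations could be averaged per relative position within a document. The record layout `(passage_index, num_passages, grade)`, the function name `relevance_by_position`, the integer grade scale, and the toy values are all assumptions for illustration; they are not the FiRA file format or the authors' analysis code.

```python
from collections import defaultdict


def relevance_by_position(annotations, num_bins=4):
    """Return the mean graded relevance for each relative-position bin.

    annotations: iterable of (passage_index, num_passages, grade) tuples,
    a hypothetical layout chosen for this sketch.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for passage_index, num_passages, grade in annotations:
        # Map the passage's relative position in its document to a bin index.
        bin_idx = min(passage_index * num_bins // num_passages, num_bins - 1)
        totals[bin_idx] += grade
        counts[bin_idx] += 1
    # Bins with no annotated passages are reported as None.
    return [totals[b] / counts[b] if counts[b] else None for b in range(num_bins)]


# Made-up annotations: one four-passage document and one two-passage document.
toy = [(0, 4, 3), (1, 4, 1), (2, 4, 0), (3, 4, 0), (0, 2, 2), (1, 2, 1)]
print(relevance_by_position(toy))
```

Binning by relative rather than absolute position lets documents of different lengths contribute to the same curve; an analysis by absolute passage index would be an equally reasonable choice.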
