DEXTER: A Benchmark for open-domain Complex Question Answering using LLMs
Venktesh V., Deepali Prabhu, Avishek Anand

TL;DR
This paper introduces DEXTER, a comprehensive benchmark for open-domain complex question answering that evaluates retrieval and reasoning capabilities of models, revealing current limitations and guiding future improvements.
Contribution
It presents a new benchmark with diverse complex QA tasks and a toolkit for evaluating retrieval models and LLM reasoning in open-domain settings.
Findings
Late-interaction models and lexical models such as BM25 perform well in retrieval compared with other pre-trained dense retrieval models.
Significant room for improvement exists in retrieval to enhance QA performance.
LLMs' reasoning capabilities are heavily influenced by retrieval quality.
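For readers unfamiliar with the lexical baseline named in the findings, the following is a minimal sketch of Okapi BM25 scoring. It is not the DEXTER toolkit's implementation or API. The function name `bm25_scores`, the toy corpus, and the parameter values `k1=1.5` and `b=0.75` are illustrative assumptions, and the IDF uses the common non-negative variant.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    Hypothetical helper for illustration; k1 and b are assumed defaults.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant: ln((N - df + 0.5) / (df + 0.5) + 1).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (made up for this sketch), tokenized by whitespace.
docs = [
    "the eiffel tower is in paris".split(),
    "paris is the capital of france".split(),
    "bm25 is a lexical ranking function".split(),
]
query = "capital of france".split()
scores = bm25_scores(query, docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Here `best` is the index of the highest-scoring document. Documents that share no term with the query receive a score of zero, which is the usual weakness of purely lexical matching that dense and late-interaction retrievers aim to address.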
Abstract
Open-domain complex Question Answering (QA) is a difficult task with challenges in evidence retrieval and reasoning. The complexity of such questions could stem from questions being compositional, from hybrid evidence, or from ambiguity in questions. While retrieval performance for classical QA tasks is well explored, the capabilities of retrieval models for heterogeneous complex retrieval tasks, especially in an open-domain setting, and their impact on downstream QA performance, are relatively unexplored. To address this, we propose a benchmark comprising diverse complex QA tasks and provide a toolkit to evaluate state-of-the-art pre-trained dense and sparse retrieval models in an open-domain setting. We observe that late interaction models and, surprisingly, lexical models like BM25 perform well compared to other pre-trained dense retrieval models. In addition, since context-based reasoning is critical…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
