A Neural Model for Joint Document and Snippet Ranking in Question Answering for Large Document Collections
Dimitris Pappas, Ion Androutsopoulos

TL;DR
This paper introduces a neural architecture that ranks documents and snippets jointly in question answering systems, significantly improving snippet retrieval accuracy over traditional pipeline approaches on biomedical and natural question datasets.
Contribution
The paper proposes a novel joint ranking model that outperforms pipeline methods in snippet retrieval, uses fewer trainable parameters, and can be combined with various neural rankers. It is evaluated on biomedical and natural question datasets.
Findings
Joint models outperform pipelines in snippet retrieval
Fewer trainable parameters needed for joint models
Competitive document retrieval performance
Abstract
Question answering (QA) systems for large document collections typically use pipelines that (i) retrieve possibly relevant documents, (ii) re-rank them, (iii) rank paragraphs or other snippets of the top-ranked documents, and (iv) select spans of the top-ranked snippets as exact answers. Pipelines are conceptually simple, but errors propagate from one component to the next, without later components being able to revise earlier decisions. We present an architecture for joint document and snippet ranking, the two middle stages, which leverages the intuition that relevant documents have good snippets and good snippets come from relevant documents. The architecture is general and can be used with any neural text relevance ranker. We experiment with two main instantiations of the architecture, based on POSIT-DRMM (PDRMM) and a BERT-based ranker. Experiments on biomedical data from BIOASQ…
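The intuition in the abstract (relevant documents have good snippets, and good snippets come from relevant documents) can be illustrated with a toy re-scoring scheme. The sketch below is not the paper's architecture, which is a trained neural model. It is a hand-written illustration that assumes initial document and snippet scores in [0, 1] are already available from some ranker. The names `ScoredDoc`, `joint_scores`, `rank_jointly`, and the mixing weight `w` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScoredDoc:
    """A document with initial scores from any relevance ranker (hypothetical container)."""
    doc_id: str
    doc_score: float             # initial document relevance, assumed in [0, 1]
    snippet_scores: List[float]  # initial relevance of each snippet, assumed in [0, 1]


def joint_scores(doc: ScoredDoc, w: float = 0.5) -> Tuple[float, List[float]]:
    """Revise document and snippet scores using each other.

    Assumption for illustration only: a fixed linear mix with weight w.
    A learned model would replace both formulas.
    """
    best_snippet = max(doc.snippet_scores, default=0.0)
    # A document is promoted if it contains at least one good snippet.
    revised_doc = w * doc.doc_score + (1.0 - w) * best_snippet
    # A snippet is promoted if it comes from a (revised) relevant document.
    revised_snippets = [w * revised_doc + (1.0 - w) * s for s in doc.snippet_scores]
    return revised_doc, revised_snippets


def rank_jointly(docs: List[ScoredDoc], w: float = 0.5):
    """Return (document ranking, snippet ranking), both sorted by revised score."""
    doc_ranking = []
    snippet_ranking = []
    for doc in docs:
        revised_doc, revised_snippets = joint_scores(doc, w)
        doc_ranking.append((doc.doc_id, revised_doc))
        for index, score in enumerate(revised_snippets):
            snippet_ranking.append((doc.doc_id, index, score))
    doc_ranking.sort(key=lambda item: item[1], reverse=True)
    snippet_ranking.sort(key=lambda item: item[2], reverse=True)
    return doc_ranking, snippet_ranking
```

With two hypothetical documents, A (document score 0.9, snippet scores 0.1 and 0.2) and B (document score 0.6, one snippet scored 0.9), this toy scheme ranks B above A (0.75 against 0.55) even though A had the higher initial document score. A strict pipeline that fixes the document ranking before looking at snippets could not make that revision.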
