Retrieving Contextual Information for Long-Form Question Answering using Weak Supervision

Philipp Christmann; Svitlana Vakulenko; Ionut Teodor Sorodoc; Bill Byrne; Adrià de Gispert

arXiv:2410.08623·cs.CL·October 14, 2024

TL;DR

This paper introduces weak supervision techniques to improve retrieval of contextual information for long-form question answering, enhancing answer relevance and groundedness, and addressing the lack of training data for context retrieval.

Contribution

It proposes novel weak supervision methods for better contextual retrieval in LFQA and demonstrates significant improvements in answer quality and relevant page recall.

Findings

01. 14.7% increase in relevant page recall
02. 12.5% improvement in groundedness of answers
03. Enhanced ability to anticipate follow-up questions

Abstract

Long-form question answering (LFQA) aims at generating in-depth answers to end-user questions, providing relevant information beyond the direct answer. However, existing retrievers are typically optimized towards information that directly targets the question, missing out on such contextual information. Furthermore, there is a lack of training data for relevant context. To this end, we propose and compare different weak supervision techniques to optimize retrieval for contextual information. Experiments demonstrate improvements on the end-to-end QA performance on ASQA, a dataset for long-form question answering. Importantly, as more contextual information is retrieved, we improve the relevant page recall for LFQA by 14.7% and the groundedness of generated long-form answers by 12.5%. Finally, we show that long-form answers often anticipate likely follow-up questions, via experiments on a…
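
As a rough illustration of the "relevant page recall" figure mentioned in the abstract, the sketch below computes the fraction of relevant pages that appear among the retrieved ones. This is only an assumed, generic definition of recall over pages; the paper's exact metric, its data format, and whether the 14.7% is a relative or absolute gain are not stated here. The function name `page_recall` and the page titles are hypothetical.

```python
from typing import Iterable, Set


def page_recall(retrieved_pages: Iterable[str], relevant_pages: Set[str]) -> float:
    """Fraction of relevant pages found among the retrieved pages.

    Assumed definition for illustration; not taken from the paper.
    """
    if not relevant_pages:
        # No relevant pages annotated: define recall as 0.0 to avoid dividing by zero.
        return 0.0
    hits = relevant_pages & set(retrieved_pages)
    return len(hits) / len(relevant_pages)


# Hypothetical example: these page titles are made up.
retrieved = ["Page A", "Page B", "Page D"]
relevant = {"Page A", "Page B", "Page C"}
recall = page_recall(retrieved, relevant)
```

Under this definition, a retriever that surfaces passages from more of the annotated pages scores higher, regardless of how many passages it returns per page.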

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems