New Methods & Metrics for LFQA tasks

Suchismit Mahapatra; Vladimir Blagojevic; Pablo Bertorello; Prasanna Kumar

arXiv:2112.13432·cs.CL·December 28, 2021

Open Access

TL;DR

This paper introduces new methods and metrics for LFQA tasks, addressing dataset overlap, lack of automatic evaluation metrics, and ungrounded answers, to improve the reliability and progress of long-form question answering systems.

Contribution

It proposes novel NLI/NLG methods and metrics specifically designed to tackle key challenges in LFQA, such as dataset overlap and answer grounding.

Findings

01 Reduced dataset overlap issues

02 Introduced automatic evaluation metrics for LFQA

03 Enhanced grounding of answers in retrieved documents

Abstract

Long-form question answering (LFQA) tasks require retrieving the documents pertinent to a query and using them to form a paragraph-length answer. Despite considerable progress in LFQA modeling, three fundamental issues impede further progress: i) train/validation/test dataset overlap, ii) the absence of automatic metrics, and iii) generated answers not being "grounded" in the retrieved documents. This work addresses each of these critical bottlenecks, contributing natural language inference/generation (NLI/NLG) methods and metrics that make significant strides toward alleviating them.
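To make issue i) concrete, here is a minimal sketch of how overlap between a training split and a held-out split could be measured. It is an illustration only, not the method proposed in the paper: the token-level Jaccard similarity, the 0.8 threshold, the function names (`tokens`, `jaccard`, `overlapping_questions`), and the toy questions are all assumptions chosen for brevity.

```python
import re


def tokens(text: str) -> frozenset:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_questions(train, held_out, threshold=0.8):
    """Return (question, score) pairs for held-out questions whose best
    Jaccard match against any training question meets the threshold.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    train_sets = [tokens(q) for q in train]
    flagged = []
    for question in held_out:
        q_set = tokens(question)
        best = max((jaccard(q_set, t) for t in train_sets), default=0.0)
        if best >= threshold:
            flagged.append((question, best))
    return flagged


# Toy data, invented for illustration.
train = ["Why is the sky blue?", "How do vaccines work?"]
test = ["Why is the sky blue", "What causes inflation?"]

flagged = overlapping_questions(train, test)
overlap_rate = len(flagged) / len(test)
print(flagged, overlap_rate)
```

A lexical measure like this only catches near-verbatim duplicates. Paraphrased questions would need a semantic similarity measure, which this sketch does not attempt.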

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications