Enhancing Pre-Trained Generative Language Models with Question Attended Span Extraction on Machine Reading Comprehension

Lin Ai; Zheng Hui; Zizhou Liu; Julia Hirschberg

arXiv:2404.17991·cs.CL·October 17, 2024

Open Access

TL;DR

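This paper introduces the Question-Attended Span Extraction (QASE) module, which is integrated during fine-tuning to improve the span extraction ability of pre-trained generative language models on machine reading comprehension. The fine-tuned models surpass the extractive capabilities of LLMs such as GPT-4 in few-shot settings, with no increase in computational demands.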

Contribution

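The paper presents QASE, a module applied during the fine-tuning of pre-trained generative language models that significantly boosts their extractive performance on MRC tasks and addresses out-of-control generation, where generated answers are incorrect, irrelevant, or unfaithful to the source text.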

Findings

01

QASE improves generative models' accuracy on MRC datasets.

02

QASE-enhanced models surpass the extractive capabilities of LLMs such as GPT-4 in few-shot settings.

03

Performance gains are achieved without increased computational costs.

Abstract

Machine Reading Comprehension (MRC) poses a significant challenge in the field of Natural Language Processing (NLP). While mainstream MRC methods predominantly leverage extractive strategies using encoder-only models such as BERT, generative approaches face the issue of out-of-control generation -- a critical problem where answers generated are often incorrect, irrelevant, or unfaithful to the source text. To address these limitations in generative models for MRC, we introduce the Question-Attended Span Extraction (QASE) module. Integrated during the fine-tuning phase of pre-trained generative language models (PLMs), QASE significantly enhances their performance, allowing them to surpass the extractive capabilities of advanced Large Language Models (LLMs) such as GPT-4 in few-shot settings. Notably, these gains in performance do not come with an increase in computational demands. The…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer · Dense Connections · Label Smoothing · Linear Warmup With Linear Decay · Weight Decay