Lexical Repetitions Lead to Rote Learning: Unveiling the Impact of Lexical Overlap in Train and Test Reference Summaries

Prafulla Kumar Choubey; Alexander R. Fabbri; Caiming Xiong; Chien-Sheng Wu

arXiv:2311.09458 · cs.CL · November 17, 2023 · 1 citation

Open Access

TL;DR

This paper investigates how lexical overlap between training and test summaries causes models to memorize data, leading to poor generalization, and proposes limiting lexical repetitions during training to enhance model robustness and novelty in summarization.

Contribution

It introduces a fine-grained evaluation protocol based on lexical similarity and demonstrates that limiting lexical repetitions during training reduces rote learning and improves generalization.

Findings

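01 Performance differs sharply across lexical similarity levels: up to 5x in ROUGE-2 and 1.2x in entity recall between the test subsets with the lowest and highest similarity to training summaries.

02 Limiting lexical repetitions in training summaries reduces factual errors and rote memorization.

03 Generalization improves on novel and recent news summaries.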

Abstract

Ideal summarization models should generalize to novel summary-worthy content without remembering reference training summaries by rote. However, a single average performance score on the entire test set is inadequate in determining such model competencies. We propose a fine-grained evaluation protocol by partitioning a test set based on the lexical similarity of reference test summaries with training summaries. We observe up to a 5x (1.2x) difference in ROUGE-2 (entity recall) scores between the subsets with the lowest and highest similarity. Next, we show that such training repetitions also make a model vulnerable to rote learning, reproducing data artifacts such as factual errors, especially when reference test summaries are lexically close to training summaries. Consequently, we propose to limit lexical repetitions in training summaries during both supervised fine-tuning and…
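The partitioning idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the number of subsets, so everything below is an assumption made for illustration: similarity is taken to be the fraction of a test summary's word bigrams that also appear in its closest training summary, tokenization is plain whitespace splitting, and the helper names (`bigrams`, `max_train_similarity`, `partition_by_similarity`) are hypothetical rather than taken from the paper.

```python
from typing import List, Set, Tuple


def bigrams(text: str) -> Set[Tuple[str, str]]:
    """Lowercased whitespace-token bigrams (assumed tokenization, not the paper's)."""
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))


def max_train_similarity(test_summary: str, train_summaries: List[str]) -> float:
    """Highest bigram overlap between one test summary and any training summary.

    Overlap is the share of the test summary's bigrams found in a training
    summary; this is an illustrative stand-in for the paper's similarity score.
    """
    test_bg = bigrams(test_summary)
    if not test_bg:
        return 0.0
    best = 0.0
    for train in train_summaries:
        overlap = len(test_bg & bigrams(train)) / len(test_bg)
        best = max(best, overlap)
    return best


def partition_by_similarity(
    test_summaries: List[str], train_summaries: List[str], num_bins: int = 3
) -> List[List[int]]:
    """Split test indices into equal-sized bins, ordered from lowest to highest similarity."""
    ranked = sorted(
        range(len(test_summaries)),
        key=lambda i: max_train_similarity(test_summaries[i], train_summaries),
    )
    bins: List[List[int]] = [[] for _ in range(num_bins)]
    for rank, idx in enumerate(ranked):
        bins[min(rank * num_bins // len(ranked), num_bins - 1)].append(idx)
    return bins


train = ["the cat sat on the mat", "stocks fell sharply on monday"]
test = [
    "the cat sat on the rug",
    "a new comet was discovered",
    "stocks fell sharply on monday",
]
subsets = partition_by_similarity(test, train, num_bins=3)
```

Scoring a model separately on each subset, rather than once on the whole test set, is what exposes the kind of gap the abstract reports between the lowest- and highest-similarity subsets.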

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare

Methods: Sparse Evolutionary Training