The Extractive-Abstractive Axis: Measuring Content "Borrowing" in Generative Language Models

Nedelina Teneva

arXiv:2307.11779·cs.CL·July 25, 2023

Open Access

TL;DR

This paper introduces the Extractive-Abstractive axis to evaluate and benchmark generative language models' tendency to produce content that is either extractive or abstractive, emphasizing the importance of developing appropriate metrics and datasets.

Contribution

It proposes a new axis for benchmarking LLMs' content generation style and highlights the need for specialized metrics, datasets, and annotation guidelines.

Findings

01. Highlights the importance of measuring content 'borrowing' in LLMs.

02. Calls for developing new metrics and datasets for better evaluation.

03. Focuses on text modality in content generation.

Abstract

Generative language models produce highly abstractive outputs by design, in contrast to the extractive responses of search engines. Given this characteristic of LLMs and the resulting implications for content licensing and attribution, we propose the so-called Extractive-Abstractive axis for benchmarking generative models and highlight the need for developing corresponding metrics, datasets, and annotation guidelines. We limit our discussion to the text modality.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies