The Extractive-Abstractive Axis: Measuring Content "Borrowing" in Generative Language Models
Nedelina Teneva

TL;DR
This paper introduces the Extractive-Abstractive axis to evaluate and benchmark generative language models' tendency to produce content that is either extractive or abstractive, emphasizing the importance of developing appropriate metrics and datasets.
Contribution
It proposes a new axis for benchmarking LLMs' content generation style and highlights the need for specialized metrics, datasets, and annotation guidelines.
Findings
Highlights the importance of measuring content 'borrowing' in LLMs.
Calls for developing new metrics and datasets for better evaluation.
Focuses on text modality in content generation.
Abstract
Generative language models produce highly abstractive outputs by design, in contrast to the extractive responses of search engines. Given this characteristic of LLMs and its implications for content licensing and attribution, we propose the so-called Extractive-Abstractive axis for benchmarking generative models and highlight the need to develop corresponding metrics, datasets, and annotation guidelines. We limit our discussion to the text modality.
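As a rough illustration of what placing an output on such an axis could look like, the sketch below computes the fraction of output n-grams that do not appear in a source text. This is not a metric proposed in the paper. The function names, the whitespace tokenization, the default bigram size, and the convention of returning 0.0 for outputs too short to form an n-gram are all assumptions made for the example. A score near 0 suggests a largely extractive output, a score near 1 a largely abstractive one.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_ratio(source, output, n=2):
    """Fraction of output n-grams absent from the source.

    Hypothetical helper for illustration only: lowercases and splits on
    whitespace, which is far cruder than a real tokenizer.
    """
    source_ngrams = set(ngrams(source.lower().split(), n))
    output_ngrams = ngrams(output.lower().split(), n)
    if not output_ngrams:
        # Assumed convention: an output too short to form an n-gram scores 0.0.
        return 0.0
    novel = sum(1 for gram in output_ngrams if gram not in source_ngrams)
    return novel / len(output_ngrams)


source = "the cat sat on the mat"
verbatim_score = novel_ngram_ratio(source, "the cat sat on the mat")
paraphrase_score = novel_ngram_ratio(source, "a feline rested upon a rug")
mixed_score = novel_ngram_ratio(source, "the cat sat quietly")
```

Surface overlap of this kind only captures lexical borrowing. A paraphrase that reuses the source's ideas with different words would score as fully abstractive, which is one reason dedicated metrics and annotation guidelines are called for.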
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
