Controlled Text Reduction

Aviv Slobodkin; Paul Roit; Eran Hirsch; Ori Ernst; Ido Dagan

arXiv:2210.13449·cs.CL·October 25, 2022·1 cites

TL;DR

This paper introduces Controlled Text Reduction, a task focused on generating coherent summaries from pre-selected content, supported by new datasets and a baseline model for research and semi-automated summarization.

Contribution

The paper formalizes Controlled Text Reduction as a standalone task, creates new datasets for it, and develops a supervised baseline model that generates coherent text from highlighted content.

Findings

01 High-quality datasets for the task were crowdsourced.

02 Larger training datasets were generated automatically from existing benchmarks.

03 The baseline model shows promising results and provides insights.

Abstract

Producing a reduced version of a source text, as in generic or focused summarization, inherently involves two distinct subtasks: deciding on targeted content and generating a coherent text conveying it. While some popular approaches address summarization as a single end-to-end task, prominent works support decomposed modeling for individual subtasks. Further, semi-automated text reduction is also very appealing, where users may identify targeted content while models would generate a corresponding coherent summary. In this paper, we focus on the second subtask, of generating coherent text given pre-selected content. Concretely, we formalize Controlled Text Reduction as a standalone task, whose input is a source text with marked spans of targeted content ("highlighting"). A model then needs to generate a coherent text that includes all and only the target information. We…
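To make the task input concrete, the following is a minimal sketch of one way to represent a source text with highlighted spans as a single string that a sequence-to-sequence model could consume. The marker tokens `<h>` and `</h>`, the character-offset span format, and the function names `merge_spans` and `mark_highlights` are assumptions made for illustration; the abstract does not specify how highlights are encoded.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def merge_spans(spans: List[Span]) -> List[Span]:
    """Sort spans and merge any that overlap or touch."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Extend the previous span instead of opening a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def mark_highlights(
    text: str,
    spans: List[Span],
    open_tok: str = "<h>",    # hypothetical marker, not taken from the paper
    close_tok: str = "</h>",  # hypothetical marker, not taken from the paper
) -> str:
    """Wrap each highlighted span of `text` in marker tokens."""
    for start, end in spans:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"invalid span: {(start, end)}")

    pieces: List[str] = []
    cursor = 0
    for start, end in merge_spans(spans):
        pieces.append(text[cursor:start])                       # unhighlighted context
        pieces.append(open_tok + text[start:end] + close_tok)   # targeted content
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    source = "The storm hit the coast on Monday. Thousands lost power."
    first = "The storm hit the coast"
    second = "Thousands lost power"
    highlights = [
        (source.index(first), source.index(first) + len(first)),
        (source.index(second), source.index(second) + len(second)),
    ]
    print(mark_highlights(source, highlights))
```

In this sketch the unhighlighted context stays in the input, so a model can still use it to resolve references and connect the highlighted pieces, while the markers indicate which content the output should cover.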

Code & Models

Models

biu-nlp/led-base-controlled-text-reduction
model · 6 dl · ♡ 1

Datasets

biu-nlp/Controlled-Text-Reduction-dataset
dataset · 55 dl

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Test