A Unifying Scheme for Extractive Content Selection Tasks

Shmuel Amar; Ori Shapira; Aviv Slobodkin; Ido Dagan

arXiv:2507.16922 · cs.CL · July 24, 2025

TL;DR

This paper introduces a unified instruction-guided framework for diverse extractive content selection tasks in NLP, along with a new benchmark and synthetic dataset to improve transfer learning and evaluation.

Contribution

It proposes a novel unified framework, introduces the first comprehensive benchmark, and creates a synthetic dataset to enhance transfer learning across content selection tasks.

Findings

01 Transfer learning with the benchmark and synthetic data often improves performance across tasks.

02 A unified benchmark facilitates comparison of models across content selection tasks.

03 The synthetic dataset boosts model generalization.

Abstract

A broad range of NLP tasks involve selecting relevant text spans from given source texts. Despite this shared objective, such content selection tasks have traditionally been studied in isolation, each with its own modeling approaches, datasets, and evaluation metrics. In this work, we propose instruction-guided content selection (IGCS) as a beneficial unified framework for such settings, where the task definition and any instance-specific request are encapsulated as instructions to a language model. To promote this framework, we introduce IGCSBench, the first unified benchmark covering diverse content selection tasks. Further, we create a large generic synthetic dataset that can be leveraged for diverse content selection tasks, and show that transfer learning with these datasets often boosts performance, whether dedicated training for the targeted task is available…
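
To make the framing concrete, here is a minimal Python sketch of what an instruction-guided content selection instance might look like. It is not the authors' implementation: the `SelectionInstance` class, the prompt template in `build_prompt`, and the exact-match grounding in `ground_spans` are all assumptions made for illustration, and the model call is replaced by a hard-coded list of returned strings.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SelectionInstance:
    """One content selection instance (hypothetical structure)."""
    instruction: str      # task definition plus any instance-specific request
    documents: List[str]  # source texts to select spans from


def build_prompt(instance: SelectionInstance) -> str:
    """Render the instance as a single prompt (assumed template, not the paper's)."""
    parts = [f"Instruction: {instance.instruction}", ""]
    for i, doc in enumerate(instance.documents):
        parts.append(f"Document #{i}:")
        parts.append(doc)
        parts.append("")
    parts.append("Return the selected spans verbatim, one per line.")
    return "\n".join(parts)


def ground_spans(selections: List[str], documents: List[str]) -> List[Tuple[int, int, int]]:
    """Map each selected string to (doc_index, start, end) by exact match.

    Selections that do not occur verbatim in any document are dropped,
    which keeps the final output strictly extractive.
    """
    grounded = []
    for text in selections:
        text = text.strip()
        if not text:
            continue
        for doc_idx, doc in enumerate(documents):
            start = doc.find(text)
            if start != -1:
                grounded.append((doc_idx, start, start + len(text)))
                break
    return grounded


# Illustrative usage with a stand-in for the language model's response.
instance = SelectionInstance(
    instruction="Select the sentences that mention the benchmark.",
    documents=["We propose a framework. We also release a benchmark.", "Unrelated text."],
)
prompt = build_prompt(instance)
model_output = ["We also release a benchmark.", "A span that is not in the source."]
spans = ground_spans(model_output, instance.documents)
```

Under this sketch, different tasks differ only in the `instruction` string, while the selection output always resolves to character offsets in the source documents. Exact matching is the simplest grounding choice; a real system would likely need a more tolerant matcher for near-verbatim model output.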

Taxonomy

Topics: Natural Language Processing Techniques · Rough Sets and Fuzzy Logic · Text and Document Classification Technologies