Clickbait Spoiling via Question Answering and Passage Retrieval

Matthias Hagen; Maik Fröbe; Artur Jurk; Martin Potthast

arXiv:2203.10282·cs.CL·March 22, 2022

TL;DR

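This paper addresses clickbait spoiling by classifying the type of spoiler needed (a phrase or a passage) and generating the spoiler with question answering models. On a new corpus of 5,000 manually spoiled clickbait posts, the spoiler type classifier reaches 80% accuracy and DeBERTa-large outperforms all other models at generating spoilers of both types.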

Contribution

The paper introduces the task of clickbait spoiling, develops approaches for spoiler type classification and spoiler generation, and provides a corpus of 5,000 manually spoiled clickbait posts for evaluation.

Findings

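1. The spoiler type classifier achieves an accuracy of 80%.
2. The question answering model DeBERTa-large outperforms all other models in generating spoilers for both types.
3. The evaluation and error analysis use a new corpus of 5,000 manually spoiled clickbait posts, the Webis Clickbait Spoiling Corpus 2022.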

Abstract

We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. Clickbait links to a web page and advertises its contents by arousing curiosity instead of providing an informative summary. Our contributions are approaches to classify the type of spoiler needed (i.e., a phrase or a passage), and to generate appropriate spoilers. A large-scale evaluation and error analysis on a new corpus of 5,000 manually spoiled clickbait posts -- the Webis Clickbait Spoiling Corpus 2022 -- shows that our spoiler type classifier achieves an accuracy of 80%, while the question answering model DeBERTa-large outperforms all others in generating spoilers for both types.
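To make the passage-type case concrete, the following is a minimal sketch of passage retrieval for spoiling: it ranks the paragraphs of a linked page by how many tokens they share with the clickbait post and returns the top one as a candidate spoiler. This is an illustration only, not the method evaluated in the paper. The function names `tokenize` and `rank_passages`, the overlap score, and the example post and paragraphs are all assumptions made up for this sketch.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_passages(post, passages):
    """Rank passages by how often they contain tokens from the post.

    Hypothetical stand-in for a real retrieval model: the score is the
    number of token occurrences in the passage that also appear in the post.
    """
    query_terms = set(tokenize(post))
    scored = []
    for index, passage in enumerate(passages):
        counts = Counter(tokenize(passage))
        score = sum(counts[term] for term in query_terms)
        scored.append((score, index))
    # Highest score first; ties keep the original passage order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [passages[index] for _, index in scored]


# Invented example data, not taken from the corpus.
post = "You won't believe which fruit helps you sleep"
passages = [
    "Many people struggle to rest at night.",
    "Researchers found that the fruit kiwi helps people sleep longer.",
    "The study was published last year.",
]
ranked = rank_passages(post, passages)
best = ranked[0]
print(best)
```

A phrase-type spoiler would instead call for extracting a short answer span, which is where an extractive question answering model such as the DeBERTa-large model named in the abstract would come in.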

Code & Models

Repositories

webis-de/acl-22 (official)

Taxonomy

Topics: Misinformation and Its Impacts · Topic Modeling · Text Readability and Simplification