Clickbait Spoiling via Question Answering and Passage Retrieval
Matthias Hagen, Maik Fröbe, Artur Jurk, Martin Potthast

TL;DR
This paper addresses clickbait spoiling in two steps: classifying the type of spoiler a post needs (a phrase or a passage) and generating the spoiler with question answering models. On a new corpus of 5,000 manually spoiled clickbait posts, the spoiler type classifier reaches 80% accuracy and DeBERTa-large outperforms the other models tested.
Contribution
It introduces the task of clickbait spoiling, develops classification and generation methods, and provides a large annotated dataset for evaluation.
Findings
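The spoiler type classifier (phrase vs. passage) achieves 80% accuracy.
The question answering model DeBERTa-large outperforms all other models in generating spoilers of both types.
The evaluation and error analysis use a new corpus of 5,000 manually spoiled clickbait posts, the Webis Clickbait Spoiling Corpus 2022.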
Abstract
We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. Clickbait links to a web page and advertises its contents by arousing curiosity instead of providing an informative summary. Our contributions are approaches to classify the type of spoiler needed (i.e., a phrase or a passage), and to generate appropriate spoilers. A large-scale evaluation and error analysis on a new corpus of 5,000 manually spoiled clickbait posts -- the Webis Clickbait Spoiling Corpus 2022 -- shows that our spoiler type classifier achieves an accuracy of 80%, while the question answering model DeBERTa-large outperforms all others in generating spoilers for both types.
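The two-step approach described in the abstract (decide which spoiler type is needed, then produce the spoiler from the linked page) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the cue-word rule in `classify_spoiler_type`, the token-overlap scoring in `retrieve_passage`, and the `answer_extractor` hook are all hypothetical stand-ins for the trained classifier and the question answering models evaluated in the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify_spoiler_type(post):
    """Toy stand-in for a spoiler type classifier (NOT the paper's model).

    Assumption: posts containing explanation cues need a passage spoiler;
    all other posts need a phrase spoiler.
    """
    passage_cues = {"why", "how", "reason", "reasons"}
    return "passage" if passage_cues & set(tokenize(post)) else "phrase"


def retrieve_passage(post, paragraphs):
    """Return the paragraph that shares the most tokens with the post.

    Plain token overlap is a simple placeholder for a real retrieval model.
    """
    query = Counter(tokenize(post))

    def overlap(paragraph):
        # Counter intersection keeps the minimum count of each shared token.
        return sum((query & Counter(tokenize(paragraph))).values())

    return max(paragraphs, key=overlap)


def spoil(post, paragraphs, answer_extractor=None):
    """Classify the spoiler type, then build a spoiler from the linked page.

    `answer_extractor` is a hypothetical hook: a callable (post, passage) -> str
    standing in for an extractive question answering model.
    """
    spoiler_type = classify_spoiler_type(post)
    passage = retrieve_passage(post, paragraphs)
    if spoiler_type == "phrase" and answer_extractor is not None:
        return spoiler_type, answer_extractor(post, passage)
    return spoiler_type, passage
```

In this sketch the clickbait post acts as the question and the linked page's paragraphs act as the context, so a call such as `spoil(post, paragraphs)` returns the predicted type together with either a retrieved paragraph or whatever the supplied extractor returns.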
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · Text Readability and Simplification
