Spoiler in a Textstack: How Much Can Transformers Help?

Anna Wróblewska; Paweł Rzepiński; Sylwia Sysko-Romańczuk

arXiv:2112.12913 · cs.CL · December 28, 2021

TL;DR

This research explores how well transformer-based models detect spoilers in reviews, reporting ROC AUC above 81% on the TV Tropes Movies dataset and above 88% on the Goodreads dataset, and applies interpretability techniques to explain the models' results.

Contribution

The paper tests a transfer learning approach with several transformer architectures for spoiler detection, assembles a new dataset with fine-grained annotations, and uses interpretability techniques to assess the models' reliability.

Findings

01. ROC AUC above 81% on TV Tropes Movies dataset
02. ROC AUC above 88% on Goodreads dataset
03. Effective interpretability techniques applied to model results

Abstract

This paper presents our research on spoiler detection in reviews. In this use case, we describe the method of fine-tuning and organizing the available text-based model tasks with the latest deep learning achievements and techniques to interpret the models' results. Until now, spoiler research has rarely been described in the literature. We tested the transfer learning approach and several recent transformer architectures on two open datasets with annotated spoilers (ROC AUC above 81% on the TV Tropes Movies dataset and above 88% on the Goodreads dataset). We also collected data and assembled a new dataset with fine-grained annotations. To that end, we employed interpretability techniques and measures to assess the models' reliability and explain their results.
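For readers unfamiliar with the metric quoted above, ROC AUC can be read as the probability that a randomly chosen spoiler review receives a higher model score than a randomly chosen non-spoiler review. The following is a minimal sketch of that definition in plain Python. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not taken from the paper or its datasets.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise ranking: the fraction of (positive, negative)
    pairs in which the positive example gets the higher score.
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = spoiler, 0 = no spoiler; scores are made-up
# classifier outputs, not results from the paper.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(f"ROC AUC on toy data: {auc:.3f}")
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; for real evaluation a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.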

Taxonomy

Topics: Advanced Text Analysis Techniques · Computational and Text Analysis Methods · Topic Modeling