ARC-NLP at PAN 2023: Hierarchical Long Text Classification for Trigger Detection
Umitcan Sahin, Izzet Emre Kucukkaya, Cagri Toraman

TL;DR
This paper presents a hierarchical long text classification method for detecting triggering content in fanfiction, combining Transformer fine-tuning and LSTM models to improve detection accuracy.
Contribution
The authors introduce a novel hierarchical model that integrates Transformer-based language models with LSTM for multi-label trigger detection in long texts.
Findings
Achieved an F1-macro score of 0.372 on the validation set
Achieved an F1-micro score of 0.736 on the validation set
Outperformed the baseline results at PAN CLEF 2023
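The two F1 figures above average differently: F1-macro averages per-label F1 scores, so every label counts equally, while F1-micro pools true positives, false positives, and false negatives across all labels, so frequent labels dominate. The following is a minimal sketch of both for multi-label predictions, not the shared task's official scorer. The function names, the toy data, and the convention of scoring a label with no positives as 0.0 are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_macro_micro(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Tuple[float, float]:
    """Compute (macro, micro) F1 for binary multi-label indicator rows."""
    n_labels = len(y_true[0])
    tp: List[int] = [0] * n_labels
    fp: List[int] = [0] * n_labels
    fn: List[int] = [0] * n_labels
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                tp[j] += 1
            elif p and not t:
                fp[j] += 1
            elif t and not p:
                fn[j] += 1
    # Macro: average the per-label F1 scores.
    macro = sum(f1_from_counts(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels
    # Micro: pool the counts over all labels, then compute a single F1.
    micro = f1_from_counts(sum(tp), sum(fp), sum(fn))
    return macro, micro


# Toy data (hypothetical): label 0 is frequent and always found, label 1 is rare and missed.
y_true = [[1, 0], [1, 0], [1, 0], [0, 1]]
y_pred = [[1, 0], [1, 0], [1, 0], [0, 0]]
macro, micro = f1_macro_micro(y_true, y_pred)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

In the toy data the missed rare label halves the macro score but barely moves the micro score. A gap in the same direction as the one reported above is commonly seen when rare labels are harder to predict than frequent ones, though the summary does not state the cause.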
Abstract
Fanfiction, a popular form of creative writing set within established fictional universes, has gained a substantial online following. However, ensuring the well-being and safety of participants has become a critical concern in this community. Detecting triggering content, that is, material that may cause emotional distress or trauma to readers, poses a significant challenge. In this paper, we describe our approach to the Trigger Detection shared task at PAN CLEF 2023, where the goal is to detect multiple types of triggering content in a given fanfiction document. For this, we build a hierarchical model that uses recurrence over Transformer-based language models. In our approach, we first split long documents into smaller segments and use them to fine-tune a Transformer model. Then, we extract feature embeddings from the fine-tuned Transformer model, which are used as input in the training of…
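The first step described in the abstract, splitting a long document into smaller segments, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the 512-token segment length, the stride parameter, and the use of whitespace tokens in place of a subword tokenizer are all hypothetical choices.

```python
from typing import List


def split_into_segments(
    tokens: List[str], max_len: int = 512, stride: int = 512
) -> List[List[str]]:
    """Split a token list into consecutive segments of at most max_len tokens.

    stride == max_len gives non-overlapping segments; a smaller stride makes
    neighbouring segments overlap. Both defaults are assumed values.
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    segments: List[List[str]] = []
    for start in range(0, len(tokens), stride):
        segments.append(tokens[start:start + max_len])
        # Stop once the current segment already reaches the end of the document.
        if start + max_len >= len(tokens):
            break
    return segments


# Stand-in for a long document: 1200 whitespace tokens.
document = " ".join(f"w{i}" for i in range(1200))
segments = split_into_segments(document.split())
print([len(segment) for segment in segments])
```

In a hierarchical setup of the kind the abstract outlines, each segment would be encoded separately and the resulting sequence of segment embeddings passed to a recurrent layer, so the document length the model can handle is no longer bounded by the Transformer's input limit.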
Taxonomy
Topics: Comics and Graphic Narratives · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Tanh Activation · Byte Pair Encoding · Linear Layer · Softmax · Layer Normalization · Dense Connections · Dropout · Sigmoid Activation
