PUM at SemEval-2020 Task 12: Aggregation of Transformer-based models' features for offensive language recognition

Piotr Janiszewski; Mateusz Skiba; Urszula Walińska

arXiv:2010.01897 · cs.CL · October 6, 2020

TL;DR

This paper presents a method for offensive language recognition that aggregates features from fine-tuned BERT and XLNet Transformer models; it ranked 7th out of 40 in Sub-task C of SemEval-2020 Task 12.

Contribution

The approach combines hidden-layer features from separately fine-tuned BERT and XLNet models and feeds them into a fully connected neural network for offensive language detection.

Findings

01. Achieved 64.727% macro F1-score in offense target identification

02. Ranked 7th out of 40 in Sub-task C

03. Achieved 89.726% F1-score in offensive language identification
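
For readers unfamiliar with the metric, the macro F1-score reported above is the unweighted mean of the per-class F1-scores, so rare classes count as much as frequent ones. The sketch below shows the arithmetic in plain Python. The function names, labels, and predictions are made up for illustration and are not taken from the paper or the shared-task data.

```python
def f1_for_class(y_true, y_pred, label):
    """F1-score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no correct positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1-scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical three-class example (labels are illustrative only).
y_true = ["IND", "IND", "GRP", "GRP", "OTH", "OTH"]
y_pred = ["IND", "IND", "GRP", "IND", "OTH", "GRP"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

In this toy example the per-class F1-scores are 0.8, 0.5, and about 0.667, so the macro average is roughly 0.656.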

Abstract

In this paper, we describe the PUM team's entry to the SemEval-2020 Task 12. Creating our solution involved leveraging two well-known pretrained models used in natural language processing: BERT and XLNet, which achieve state-of-the-art results in multiple NLP tasks. The models were fine-tuned for each subtask separately, and features taken from their hidden layers were combined and fed into a fully connected neural network. The model using aggregated Transformer features can serve as a powerful tool for the offensive language identification problem. Our team was ranked 7th out of 40 in Sub-task C - Offense target identification with 64.727% macro F1-score and 64th out of 85 in Sub-task A - Offensive language identification (89.726% F1-score).
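
The aggregation step described in the abstract can be pictured with a small, dependency-free sketch. It is not the authors' implementation: the feature vectors are random stand-ins for hidden-layer outputs of fine-tuned BERT and XLNet, concatenation is only one assumed way to combine them, and the single untrained dense layer with three outputs is a hypothetical placeholder for the fully connected network.

```python
import math
import random


def concat_features(bert_vec, xlnet_vec):
    """Combine two feature vectors by concatenation (an assumed choice)."""
    return list(bert_vec) + list(xlnet_vec)


def init_layer(n_in, n_out, rng):
    """Randomly initialise weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, bias):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
# Stand-ins for hidden-layer features; real features would come from the
# fine-tuned models and be far larger than 8 dimensions.
bert_features = [rng.gauss(0, 1) for _ in range(8)]
xlnet_features = [rng.gauss(0, 1) for _ in range(8)]

combined = concat_features(bert_features, xlnet_features)
weights, bias = init_layer(len(combined), 3, rng)  # 3 classes is an assumption
probs = softmax(dense(combined, weights, bias))
print(len(combined), [round(p, 3) for p in probs])
```

In practice the classifier on top of the combined features would be trained on labelled data; the sketch only shows how the two feature sets flow into one network.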

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Layer Normalization · Byte Pair Encoding · WordPiece · Multi-Head Attention · Dropout · Linear Warmup With Linear Decay