ur-iw-hnt at GermEval 2021: An Ensembling Strategy with Multiple BERT Models

Hoai Nam Tran; Udo Kruschwitz

arXiv:2110.02042·cs.CL·October 6, 2021

TL;DR

This paper presents an ensembling approach using multiple BERT models for classifying comments in GermEval 2021, demonstrating that ensemble models outperform individual models across toxic, engaging, and fact-claiming comment detection tasks.

Contribution

The paper introduces an ensembling strategy combining diverse BERT models for comment classification, showing improved performance over single models.

Findings

01 Ensemble models outperform every individual BERT model in all three subtasks.

02 Among individual models, the Twitter-based BERT models perform best, with BERTweet leading in every subtask.

03 Multilingual models perform slightly worse than the German-based and Twitter-based models.

Abstract

This paper describes our approach (ur-iw-hnt) for the GermEval 2021 Shared Task on identifying toxic, engaging, and fact-claiming comments. We submitted three runs using an ensembling strategy based on majority (hard) voting over multiple BERT models of three types: German-based, Twitter-based, and multilingual. All ensemble models outperform the single models, while BERTweet is the best individual model in every subtask. Twitter-based models perform better than GermanBERT models, and multilingual models perform worse, but only by a small margin.
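The majority (hard) voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `majority_vote` and the toy predictions are hypothetical, and the sketch assumes binary labels and an odd number of models so that no ties occur.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists by majority (hard) voting.

    predictions: list of lists, one inner list of labels per model,
    all of the same length (one label per comment).
    Assumes an odd number of models and binary labels, so ties cannot occur.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same number of comments")

    voted = []
    # zip(*predictions) groups the labels that every model gave to one comment
    for labels in zip(*predictions):
        most_common_label, _count = Counter(labels).most_common(1)[0]
        voted.append(most_common_label)
    return voted


# Toy example: three hypothetical models labelling three comments (1 = toxic)
model_a = [1, 0, 1]
model_b = [1, 1, 0]
model_c = [0, 0, 1]
print(majority_vote([model_a, model_b, model_c]))
```

Each comment receives the label chosen by most models, so a single model's error is corrected whenever the other models agree on the right label.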

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Text Readability and Simplification · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · WordPiece · Adam · Attention Dropout · Residual Connection · Weight Decay · Dropout · Dense Connections · Softmax