FH-SWF SG at GermEval 2021: Using Transformer-Based Language Models to Identify Toxic, Engaging, & Fact-Claiming Comments

Christian Gawron; Sebastian Schmidt

arXiv:2109.02966·cs.CL·September 16, 2021

TL;DR

This paper describes fine-tuning freely available transformer-based language models from the Huggingface model hub to identify toxic, engaging, and fact-claiming comments in the GermEval 2021 shared task. The approach worked best for subtask 3, with an F1-score of 0.736.

Contribution

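It describes a shared-task system that fine-tunes pre-trained transformers for all three comment classification subtasks and compares several pre-trained models and hyperparameter settings after fine-tuning on 80% of the training data.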

Findings

01

Best result on subtask 3 (fact-claiming comments): F1-score of 0.736

02

Transformer models outperform baseline methods

03

Models were compared under different hyperparameters, and predictions of the two best-performing models were submitted

Abstract

In this paper we describe the methods we used for our submissions to the GermEval 2021 shared task on the identification of toxic, engaging, and fact-claiming comments. For all three subtasks we fine-tuned freely available transformer-based models from the Huggingface model hub. We evaluated the performance of various pre-trained models after fine-tuning on 80% of the training data with different hyperparameters and submitted predictions of the two best performing resulting models. We found that this approach worked best for subtask 3, for which we achieved an F1-score of 0.736.
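The selection procedure in the abstract (fine-tune on 80% of the training data, compare candidates, keep the two best) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `split_80_20`, `macro_f1`, and `two_best` are hypothetical, each candidate is represented by a stand-in prediction function rather than a fine-tuned transformer, and the use of macro-averaged F1 on the held-out 20% is an assumption.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[str, int]  # (comment text, binary label)


def split_80_20(examples: Sequence, seed: int = 0) -> Tuple[list, list]:
    """Shuffle the labelled data and hold out 20% for model comparison."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def macro_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 (assumed metric, see lead-in)."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def two_best(
    candidates: Dict[str, Callable[[str], int]],
    held_out: Sequence[Example],
) -> List[Tuple[str, float]]:
    """Score every candidate on the held-out data and keep the top two."""
    texts = [text for text, _ in held_out]
    gold = [label for _, label in held_out]
    scored = [
        (macro_f1(gold, [predict(text) for text in texts]), name)
        for name, predict in candidates.items()
    ]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:2]]
```

In an actual run, each entry in `candidates` would wrap one fine-tuned model and hyperparameter configuration, and the two returned names would identify the models whose predictions are submitted.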

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification