Re-Evaluating GermEval17 Using German Pre-Trained Language Models

M. Aßenmacher; A. Corvonato; C. Heumann

arXiv:2102.12330·cs.CL·July 6, 2021

TL;DR

This paper evaluates German and multilingual BERT-based models on the four GermEval17 tasks and compares them with pre-BERT, ELMo-based, and earlier BERT-based approaches. It argues that the lack of a common benchmark for non-English models leaves it uncertain how well conclusions drawn for English transfer to other languages.

Contribution

It evaluates the German and multilingual BERT-based models available via the huggingface transformers library on the four GermEval17 tasks, addressing the lack of a commonly used benchmark for non-English pre-trained language models.

Findings

01. German BERT models outperform pre-BERT architectures on GermEval17 tasks

02. Transferability of English NLP improvements to German remains uncertain

03. Multilingual models show competitive performance compared to language-specific models

Abstract

The lack of a commonly used benchmark data set (collection) such as (Super-)GLUE (Wang et al., 2018, 2019) for the evaluation of non-English pre-trained language models is a severe shortcoming of current English-centric NLP research. It concentrates a large part of the research on English, neglecting the uncertainty when transferring conclusions found for the English language to other languages. We evaluate the performance of the German and multilingual BERT-based models currently available via the huggingface transformers library on the four tasks of the GermEval17 workshop. We compare them to pre-BERT architectures (Wojatzki et al., 2017; Schmitt et al., 2018; Attia et al., 2018) as well as to an ELMo-based architecture (Biesialska et al., 2020) and a BERT-based approach (Guhr et al., 2020). The observed improvements are put in relation to those for similar tasks and similar models…

Code & Models

Repositories

ac74/reevaluating_germeval2017
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification