SwissBERT: The Multilingual Language Model for Switzerland
Jannis Vamvas, Johannes Graën, Rico Sennrich

TL;DR
SwissBERT is a multilingual masked language model specialized for Switzerland-related text. It tends to outperform previous models on Switzerland-related language understanding tasks, especially on contemporary news and Romansh Grischun.
Contribution
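The paper introduces SwissBERT, a pre-trained masked language model adapted to news articles in the four national languages of Switzerland (German, French, Italian, and Romansh). Because it uses language adapters, it may be extended to Swiss German dialects in future work.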
Findings
SwissBERT tends to outperform previous models on Switzerland-related natural language understanding tasks.
The improvement is most pronounced on contemporary news and on Romansh Grischun text.
The model and code are publicly available.
Abstract
We present SwissBERT, a masked language model created specifically for processing Switzerland-related text. SwissBERT is a pre-trained model that we adapted to news articles written in the national languages of Switzerland -- German, French, Italian, and Romansh. We evaluate SwissBERT on natural language understanding tasks related to Switzerland and find that it tends to outperform previous models on these tasks, especially when processing contemporary news and/or Romansh Grischun. Since SwissBERT uses language adapters, it may be extended to Swiss German dialects in future work. The model and our open-source code are publicly released at https://github.com/ZurichNLP/swissbert.
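The abstract's point about language adapters can be illustrated with a toy sketch. The code below is not SwissBERT's implementation: the class name ToyAdapterLayer, the sizes, the random weights, and the language codes (including "gsw" for a hypothetical Swiss German dialect adapter) are all assumptions made for illustration. It only shows the general idea that a layer combines shared weights with a small per-language module, so a new language can be added without changing the shared weights or the existing adapters.

```python
import random

HIDDEN = 4      # toy hidden size (assumption, far smaller than any real model)
BOTTLENECK = 2  # toy adapter bottleneck size (assumption)


def make_matrix(rows, cols, rng):
    """Random matrix stored as a list of rows; stands in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


class ToyAdapterLayer:
    """Hypothetical layer: one shared projection plus one adapter per language."""

    def __init__(self, languages, seed=0):
        self.rng = random.Random(seed)
        self.shared = make_matrix(HIDDEN, HIDDEN, self.rng)  # shared by all languages
        self.adapters = {}
        for lang in languages:
            self.add_language(lang)

    def add_language(self, lang):
        """Register a new language adapter; shared weights are left untouched."""
        down = make_matrix(BOTTLENECK, HIDDEN, self.rng)
        up = make_matrix(HIDDEN, BOTTLENECK, self.rng)
        self.adapters[lang] = (down, up)

    def forward(self, hidden, lang):
        """Apply the shared projection, then the selected adapter with a residual."""
        shared_out = matvec(self.shared, hidden)
        down, up = self.adapters[lang]
        adapter_out = matvec(up, relu(matvec(down, shared_out)))
        return [s + a for s, a in zip(shared_out, adapter_out)]


# Language codes here are illustrative labels for the four national languages.
layer = ToyAdapterLayer(["de_CH", "fr_CH", "it_CH", "rm_CH"])
x = [1.0, -2.0, 0.5, 3.0]

out_de = layer.forward(x, "de_CH")
shared_before = [row[:] for row in layer.shared]

# Extending to a dialect: only a new adapter is added ("gsw" is hypothetical).
layer.add_language("gsw")
out_de_after = layer.forward(x, "de_CH")
out_gsw = layer.forward(x, "gsw")
```

In this sketch, adding the "gsw" adapter leaves the output for "de_CH" unchanged, which is the property that makes adapter-based extension attractive. In practice the new adapter would need to be trained on dialect data, a step the sketch omits.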
Taxonomy
Topics: Gender Studies in Language · Natural Language Processing Techniques · Linguistic research and analysis
