Playing with Words at the National Library of Sweden -- Making a Swedish BERT
Martin Malmsten, Love Börjeson, Chris Haffenden

TL;DR
This paper presents KB-BERT, a Swedish-specific BERT model trained on the collections of the National Library of Sweden. It outperforms existing models on NLP tasks ranging from named entity recognition to part-of-speech tagging and is released for further research.
Contribution
We developed and trained a new Swedish BERT model on the National Library of Sweden's collections and show that it outperforms existing models, chiefly the model from Arbetsförmedlingen and Google's multilingual M-BERT.
Findings
KB-BERT outperforms other models in NER and POS tasks
The model is publicly available for research and further development
The lack of training data and testbeds remains a difficulty for smaller languages like Swedish
Abstract
This paper introduces the Swedish BERT ("KB-BERT") developed by the KBLab for data-driven research at the National Library of Sweden (KB). Building on recent efforts to create transformer-based BERT models for languages other than English, we explain how we used KB's collections to create and train a new language-specific BERT model for Swedish. We also present the results of our model in comparison with existing models - chiefly that produced by the Swedish Public Employment Service, Arbetsförmedlingen, and Google's multilingual M-BERT - where we demonstrate that KB-BERT outperforms these in a range of NLP tasks from named entity recognition (NER) to part-of-speech tagging (POS). Our discussion highlights the difficulties that continue to exist given the lack of training data and testbeds for smaller languages like Swedish. We release our model for further exploration and research…
Taxonomy
Topics: Semantic Web and Ontologies · Digital Humanities and Scholarship
Methods: Linear Layer · Multi-Head Attention · Residual Connection · Attention Is All You Need · Attention Dropout · Weight Decay · Adam · Softmax · WordPiece · Dense Connections
