BERTje: A Dutch BERT Model
Wietse de Vries, Andreas van Cranenburgh, Arianna Bisazza, Tommaso Caselli, Gertjan van Noord, Malvina Nissim

TL;DR
BERTje is a Dutch-specific BERT model trained on a large diverse dataset, outperforming multilingual BERT on multiple NLP tasks and publicly available for research use.
Contribution
This paper introduces BERTje, a monolingual Dutch BERT model trained on 2.4 billion tokens, and shows that it outperforms the equally-sized multilingual BERT on part-of-speech tagging, named-entity recognition, semantic role labeling, and sentiment analysis.
Findings
BERTje outperforms multilingual BERT on Dutch NLP tasks
BERTje is trained on a large, diverse dataset of 2.4 billion tokens, whereas multilingual BERT is based only on Wikipedia text
The model is publicly available for research use
Abstract
The transformer-based pre-trained language model BERT has helped to improve state-of-the-art performance on many natural language processing (NLP) tasks. Using the same architecture and parameters, we developed and evaluated a monolingual Dutch BERT model called BERTje. Compared to the multilingual BERT model, which includes Dutch but is only based on Wikipedia text, BERTje is based on a large and diverse dataset of 2.4 billion tokens. BERTje consistently outperforms the equally-sized multilingual BERT model on downstream NLP tasks (part-of-speech tagging, named-entity recognition, semantic role labeling, and sentiment analysis). Our pre-trained Dutch BERT model is made available at https://github.com/wietsedv/bertje.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax
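The "Linear Warmup With Linear Decay" entry in the methods list refers to a learning-rate schedule commonly used when training BERT-style models: the rate rises linearly from zero to a peak, then falls linearly back to zero. The following is a minimal sketch of that general technique, not the authors' training code. The function name `linear_warmup_linear_decay` and all numeric values are hypothetical and chosen only for illustration; this page does not state BERTje's actual hyperparameters.

```python
def linear_warmup_linear_decay(
    step: int, peak_lr: float, warmup_steps: int, total_steps: int
) -> float:
    """Return the learning rate at `step`.

    The rate grows linearly from 0 to `peak_lr` over `warmup_steps`,
    then shrinks linearly to 0 at `total_steps`.
    """
    if step < 0 or total_steps <= 0 or not 0 <= warmup_steps <= total_steps:
        raise ValueError("expected step >= 0 and 0 <= warmup_steps <= total_steps")
    if step < warmup_steps:
        # Warmup phase: fraction of the peak proportional to progress.
        return peak_lr * step / warmup_steps
    if step >= total_steps:
        # Training is over; the schedule has decayed to zero.
        return 0.0
    # Decay phase: fraction of the peak proportional to remaining steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    for s in (0, 50, 100, 550, 1000):
        print(s, linear_warmup_linear_decay(s, 1e-4, 100, 1000))
```

In practice such a function is usually wrapped as a multiplier for an optimizer such as Adam (also listed above) rather than called directly.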
