EstBERT: A Pretrained Language-Specific BERT for Estonian
Hasan Tanvir, Claudia Kittask, Sandra Eiche, Kairit Sirts

TL;DR
EstBERT is an Estonian-specific BERT model that outperforms multilingual BERT models on five of six evaluated NLP tasks, supporting the case for language-specific models even when multilingual ones are available.
Contribution
This paper introduces EstBERT, a pretrained language-specific BERT for Estonian, and demonstrates its superior performance over multilingual BERT on various NLP tasks.
Findings
EstBERT outperforms multilingual BERT on five of six NLP tasks.
The results add evidence that training language-specific BERT models is still useful, even when multilingual models are available.
The evaluated tasks include POS and morphological tagging, named entity recognition, and text classification.
Abstract
This paper presents EstBERT, a large pretrained transformer-based language-specific BERT model for Estonian. Recent work has evaluated multilingual BERT models on Estonian tasks and found them to outperform the baselines. Still, based on existing studies on other languages, a language-specific BERT model is expected to improve over the multilingual ones. We first describe the EstBERT pretraining process and then present the results of the models based on finetuned EstBERT for multiple NLP tasks, including POS and morphological tagging, named entity recognition and text classification. The evaluation results show that the models based on EstBERT outperform multilingual BERT models on five tasks out of six, providing further evidence that training language-specific BERT models is still useful, even when multilingual models are available.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
Methods: Linear Layer · Dropout · Attention Dropout · Softmax · Multi-Head Attention · Residual Connection · Dense Connections · WordPiece · Layer Normalization
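WordPiece, listed among the methods, is the subword tokenization scheme used by BERT-style models. The sketch below illustrates the greedy longest-match-first splitting that WordPiece tokenizers commonly apply to a single word at inference time. It is a minimal illustration only: the function name `wordpiece_tokenize` and the toy vocabulary `TOY_VOCAB` are hypothetical and are not taken from EstBERT or its actual vocabulary.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", max_chars=100):
    """Split one word into subword pieces by greedy longest-match-first lookup."""
    if len(word) > max_chars:
        return [unk_token]
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No prefix of the remainder is in the vocabulary: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary for illustration (not EstBERT's real vocabulary).
TOY_VOCAB = {"tallinn", "##as", "##a", "kool", "##is", "##i"}

print(wordpiece_tokenize("tallinnas", TOY_VOCAB))
print(wordpiece_tokenize("koolis", TOY_VOCAB))
print(wordpiece_tokenize("xyz", TOY_VOCAB))
```

A language-specific vocabulary can, in principle, cover frequent Estonian stems and suffixes with fewer pieces than a shared multilingual vocabulary, which is one plausible reason a model like EstBERT may benefit from its own tokenizer; the toy vocabulary above only hints at this.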
