TL;DR
ParsBERT is a monolingual transformer-based model specifically designed for Persian, achieving state-of-the-art results across multiple NLP tasks by leveraging a large Persian dataset.
Contribution
This paper introduces ParsBERT, a Persian-specific BERT model that outperforms multilingual models and prior approaches in various NLP tasks.
Findings
ParsBERT scores higher than multilingual BERT on all evaluated datasets.
It improves the state of the art in sentiment analysis, text classification, and named entity recognition (NER).
It is pre-trained on a large, diverse Persian dataset.
Abstract
The surge of pre-trained language models has begun a new era in the field of Natural Language Processing (NLP) by allowing us to build powerful language models. Among these models, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, these models are usually focused on English, leaving other languages to multilingual models with limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which shows its state-of-the-art performance compared to other architectures and multilingual models. Also, since the amount of data available for NLP tasks in Persian is very restricted, a massive dataset for different NLP tasks as well as pre-training the model is composed. ParsBERT obtains higher scores in all datasets, including existing ones as well as composed ones and improves the…
Code & Models
- 🤗 ForutanRad/bert-fa-QA-v1 · model · 4 dl · ♡ 2
- 🤗 HooshvareLab/bert-base-parsbert-armanner-uncased · model · 65 dl · ♡ 3
- 🤗 HooshvareLab/bert-base-parsbert-ner-uncased · model · 508 dl · ♡ 5
- 🤗 HooshvareLab/bert-base-parsbert-peymaner-uncased · model · 11 dl
- 🤗 HooshvareLab/bert-base-parsbert-uncased · model · 681 dl · ♡ 46
- 🤗 HooshvareLab/bert-fa-base-uncased · model · 2.8k dl · ♡ 32
- 🤗 HooshvareLab/bert-fa-zwnj-base · model · 68 dl · ♡ 18
- 🤗 SajjadAyoubi/distil-bigbird-fa-zwnj · model · 7 dl
- 🤗 pedramyazdipoor/parsbert_question_answering_PQuAD · model · 12 dl · ♡ 1
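The checkpoints listed above can presumably be loaded with the Hugging Face `transformers` library. The following is a minimal sketch, not the authors' own usage code: it assumes `transformers` and a backend such as PyTorch are installed, and that `HooshvareLab/bert-fa-base-uncased` ships a masked-language-model head. The helper names `build_fill_mask`, `masked_prompt`, and `demo`, and the example sentence, are hypothetical and chosen only for illustration.

```python
# Model id copied from the list above; everything else is illustrative.
MODEL_ID = "HooshvareLab/bert-fa-base-uncased"


def build_fill_mask(model_id: str = MODEL_ID):
    """Return a fill-mask pipeline (downloads weights on first call)."""
    # Imported lazily so that defining this helper needs no installation
    # or network access; calling it does.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def masked_prompt(template: str, mask_token: str) -> str:
    """Replace the hypothetical '{mask}' placeholder with the model's mask token."""
    return template.replace("{mask}", mask_token)


def demo() -> None:
    """Print the top three completions for one Persian sentence."""
    fill = build_fill_mask()
    # "Tehran is the capital of {mask}." (assumed example sentence)
    prompt = masked_prompt("تهران پایتخت {mask} است.", fill.tokenizer.mask_token)
    for candidate in fill(prompt, top_k=3):
        print(candidate["token_str"], round(candidate["score"], 3))
```

Calling `demo()` triggers the download and prints candidate tokens with their scores. The task-specific checkpoints in the list (for example the NER ones) would need a different pipeline task, such as `"token-classification"`.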
Taxonomy
Methods: Linear Layer · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout · Linear Warmup With Linear Decay · Dense Connections
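Of the methods tagged here, "Linear Warmup With Linear Decay" is the easiest to show concretely. The sketch below is a generic version of that learning-rate schedule, not the paper's exact configuration: the function name and every number in it (`total_steps`, `warmup_steps`, `peak_lr`) are assumptions chosen for illustration.

```python
def linear_warmup_linear_decay(
    step: int, total_steps: int, warmup_steps: int, peak_lr: float
) -> float:
    """Learning rate at `step`: ramp 0 -> peak_lr, then decay peak_lr -> 0."""
    if total_steps <= 0 or not 0 <= warmup_steps < total_steps:
        raise ValueError("need 0 <= warmup_steps < total_steps")
    if step < warmup_steps:
        # Warmup phase: rate grows linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: rate shrinks linearly and reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Illustrative values only; none of these come from the paper.
schedule = [
    linear_warmup_linear_decay(s, total_steps=100, warmup_steps=10, peak_lr=1e-4)
    for s in (0, 5, 10, 55, 100)
]
```

The rate peaks exactly at the end of warmup (`step == warmup_steps`) and is clamped to zero for any step past `total_steps`.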
