ParsBERT: Transformer-based Model for Persian Language Understanding

Mehrdad Farahani; Mohammad Gharachorloo; Marzieh Farahani; Mohammad Manthouri

arXiv:2005.12515·cs.CL·October 12, 2021

TL;DR

ParsBERT is a monolingual transformer-based model specifically designed for Persian, achieving state-of-the-art results across multiple NLP tasks by leveraging a large Persian dataset.

Contribution

This paper introduces ParsBERT, a Persian-specific BERT model that outperforms multilingual models and prior approaches in various NLP tasks.

Findings

1. ParsBERT achieves higher scores than multilingual BERT.

2. It improves the state of the art in sentiment analysis, text classification, and named entity recognition (NER).

3. It is pre-trained on a large, diverse Persian dataset.

Abstract

The surge of pre-trained language models has begun a new era in the field of Natural Language Processing (NLP) by allowing us to build powerful language models. Among these models, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, these models are usually focused on English, leaving other languages to multilingual models with limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which shows its state-of-the-art performance compared to other architectures and multilingual models. Also, since the amount of data available for NLP tasks in Persian is very restricted, a massive dataset for different NLP tasks as well as pre-training the model is composed. ParsBERT obtains higher scores in all datasets, including existing ones as well as composed ones and improves the…

Taxonomy

Methods: Linear Layer · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout · Linear Warmup With Linear Decay · Dense Connections