Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization

Mehrdad Farahani; Mohammad Gharachorloo; Mohammad Manthouri

arXiv:2012.11204·cs.CL·May 11, 2021


TL;DR

This paper fine-tunes two transformer-based models, mT5 and an encoder-decoder version of ParsBERT, on a new Persian summarization dataset, pn-summary. Both achieve promising results and establish a baseline for future work on Persian abstractive summarization.

Contribution

It presents the first application of mT5 and ParsBERT for Persian abstractive summarization and introduces a new dataset, pn-summary, for this task.

Findings

01 Both fine-tuned models, mT5 and the encoder-decoder ParsBERT, achieved promising summarization results.

02 The work is the first to apply mT5 and ParsBERT to Persian abstractive summarization.

03 It provides a new dataset, pn-summary, for future research.

Abstract

Text summarization is one of the most critical Natural Language Processing (NLP) tasks, and a growing body of research addresses it. Pre-trained transformer-based encoder-decoder models have begun to gain popularity for this task. This paper proposes two methods to address it and introduces a novel dataset named pn-summary for Persian abstractive text summarization. The models employed are mT5 and an encoder-decoder version of ParsBERT (a monolingual BERT model for Persian). Both models are fine-tuned on the pn-summary dataset. The work is the first of its kind and, by achieving promising results, can serve as a baseline for future work.


Code & Models

Repositories

hooshvare/pn-summary
Official

Datasets

HooshvareLab/pn_summary
dataset · 66 dl


Taxonomy

Methods: Linear Layer · Byte Pair Encoding · Gated Linear Unit · SentencePiece · Adafactor · Inverse Square Root Schedule · T5 · mT5 · Linear Warmup With Linear Decay · Attention Dropout
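
Two of the methods tagged above are learning-rate schedules. This page does not give their definitions or the hyperparameters used in the paper, so the following is only a minimal sketch of how the two schedules are commonly defined. The function names, the default warmup length, and all numeric values are illustrative assumptions, not settings reported by the authors.

```python
import math


def inverse_sqrt_lr(step: int, warmup_steps: int = 10_000) -> float:
    """Inverse square root schedule.

    The rate is held constant at 1/sqrt(warmup_steps) during warmup and
    then decays as 1/sqrt(step). The default warmup length is an
    assumption chosen for illustration.
    """
    return 1.0 / math.sqrt(max(step, warmup_steps))


def linear_warmup_linear_decay_lr(
    step: int, peak_lr: float, warmup_steps: int, total_steps: int
) -> float:
    """Linear warmup with linear decay.

    The rate rises linearly from 0 to peak_lr over warmup_steps, then
    falls linearly back to 0 at total_steps. All arguments are
    hypothetical values supplied by the caller.
    """
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(total_steps - step, 0)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Print a few sample points from each schedule.
    for s in (1, 10_000, 40_000):
        print("inverse sqrt", s, inverse_sqrt_lr(s))
    for s in (0, 500, 1_000, 5_500, 10_000):
        print("warmup/decay", s, linear_warmup_linear_decay_lr(s, 1e-4, 1_000, 10_000))
```

The first schedule needs no knowledge of the total number of training steps, while the second requires the training length up front so that the rate reaches zero exactly at the end of training.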