BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Nguyen Luong Tran; Duong Minh Le; Dat Quoc Nguyen

arXiv:2109.09701 · cs.CL · June 28, 2022 · 6 cites

TL;DR

BARTpho comprises the first large-scale pre-trained sequence-to-sequence models designed specifically for Vietnamese. The models outperform mBART on text summarization and on capitalization and punctuation restoration, and are publicly available for research and applications.

Contribution

Introduces BARTpho (BARTpho-syllable and BARTpho-word), the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese, which improve on the multilingual baseline mBART on generative NLP tasks.

Findings

01

BARTpho outperforms mBART on Vietnamese text summarization in both automatic and human evaluations.

02

BARTpho is more effective than mBART on Vietnamese capitalization and punctuation restoration.

03

BARTpho models are publicly released to facilitate future research and applications.

Abstract

We present BARTpho with two versions, BARTpho-syllable and BARTpho-word, which are the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese. BARTpho uses the "large" architecture and the pre-training scheme of the sequence-to-sequence denoising autoencoder BART, thus it is especially suitable for generative NLP tasks. We conduct experiments to compare our BARTpho with its competitor mBART on a downstream task of Vietnamese text summarization and show that: in both automatic and human evaluations, BARTpho outperforms the strong baseline mBART and improves the state-of-the-art. We further evaluate and compare BARTpho and mBART on the Vietnamese capitalization and punctuation restoration tasks and also find that BARTpho is more effective than mBART on these two tasks. We publicly release BARTpho to facilitate future research and applications of…
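
The abstract says BARTpho follows the pre-training scheme of the denoising autoencoder BART. As a rough illustration of what such a scheme involves, the sketch below applies the two noising steps described for BART: text infilling (spans with Poisson-distributed lengths are each replaced by a single mask token) and sentence permutation. It is a simplified sketch under stated assumptions, not the authors' implementation: the function names, the `MASK` symbol, and the `start_prob` parameter are hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
import math
import random

MASK = "<mask>"  # hypothetical mask symbol; a real tokenizer defines its own


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def text_infill(tokens, start_prob=0.125, lam=3.0, seed=0):
    """Walk the tokens left to right; with probability start_prob, replace a
    span of Poisson(lam) tokens with a single MASK. A zero-length span inserts
    a MASK without hiding any token. start_prob is an illustrative knob."""
    rng = random.Random(seed)
    out, i = [], 0
    while i < len(tokens):
        if rng.random() < start_prob:
            out.append(MASK)
            i += sample_poisson(rng, lam)  # skip the hidden span
        else:
            out.append(tokens[i])
            i += 1
    return out


def permute_sentences(sentences, seed=0):
    """Return the sentences in a random order (the input list is not modified)."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    doc = ["Chúng tôi là những nghiên cứu viên .", "Chúng tôi nghiên cứu ngôn ngữ ."]
    noised = [" ".join(text_infill(s.split(), seed=i)) for i, s in enumerate(doc)]
    print(permute_sentences(noised, seed=42))
```

During pre-training, a sequence-to-sequence model would receive the noised document as encoder input and be trained to reconstruct the original text, which is why this setup suits generative tasks such as summarization.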

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Attention Is All You Need · Linear Layer · Denoising Autoencoder · Dense Connections · Multi-Head Attention · Byte Pair Encoding · Softmax · Dropout · Adam