BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Nguyen Luong Tran, Duong Minh Le, Dat Quoc Nguyen

TL;DR
BARTpho is the first family of large-scale pre-trained sequence-to-sequence models built specifically for Vietnamese. It outperforms mBART on text summarization and on capitalization and punctuation restoration, and it is publicly available for research and applications.
Contribution
Introduces BARTpho, the first large-scale pre-trained sequence-to-sequence models specific to Vietnamese, which improve on the multilingual baseline mBART on generative NLP tasks.
Findings
BARTpho outperforms mBART in Vietnamese text summarization.
BARTpho is more effective than mBART at capitalization and punctuation restoration.
BARTpho models are publicly released for research use.
Abstract
We present BARTpho with two versions, BARTpho-syllable and BARTpho-word, which are the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese. BARTpho uses the "large" architecture and the pre-training scheme of the sequence-to-sequence denoising autoencoder BART, thus it is especially suitable for generative NLP tasks. We conduct experiments to compare our BARTpho with its competitor mBART on a downstream task of Vietnamese text summarization and show that: in both automatic and human evaluations, BARTpho outperforms the strong baseline mBART and improves the state-of-the-art. We further evaluate and compare BARTpho and mBART on the Vietnamese capitalization and punctuation restoration tasks and also find that BARTpho is more effective than mBART on these two tasks. We publicly release BARTpho to facilitate future research and applications of…
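The abstract says BARTpho follows the pre-training scheme of the denoising autoencoder BART. As a rough illustration of what "denoising" means here, the sketch below corrupts a token sequence with BART-style text infilling: spans whose lengths are drawn from a Poisson distribution are each replaced by a single mask token, and the model would be trained to reconstruct the original. This is a simplified toy, not the authors' implementation. The function names, the `start_prob` parameter, and the example sentence are assumptions made for illustration; the 30% masking budget and Poisson rate of 3 are the defaults reported for BART.

```python
import math
import random

MASK = "<mask>"  # placeholder mask symbol; the real tokenizer defines its own


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def text_infill(tokens, mask_ratio=0.3, lam=3.0, start_prob=0.15, seed=0):
    """Toy text infilling: collapse random spans into a single mask token.

    At most round(len(tokens) * mask_ratio) tokens are removed in total.
    start_prob (hypothetical) is the chance that a span starts at a position,
    so the realised masking rate can end up below mask_ratio.
    """
    rng = random.Random(seed)
    budget = round(len(tokens) * mask_ratio)
    out, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < start_prob:
            span = min(sample_poisson(lam, rng), budget, len(tokens) - i)
            out.append(MASK)  # the whole span becomes ONE mask token
            budget -= span
            i += span
            if span > 0:
                continue
            # A zero-length span only inserts a mask; the token is kept below.
        out.append(tokens[i])
        i += 1
    return out


if __name__ == "__main__":
    sentence = "Chúng tôi là những nghiên cứu viên .".split()
    print(text_infill(sentence, seed=1))
```

Because one mask can stand for zero, one, or several tokens, the model has to predict how much text is missing as well as what it is, which is one reason this objective suits generative tasks.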
Code & Models
- 🤗 vinai/bartpho-syllable (model) · 132k downloads · 7 likes
- 🤗 vinai/bartpho-word (model) · 2.6k downloads · 6 likes
- 🤗 vinai/bartpho-syllable-base (model) · 203 downloads · 1 like
- 🤗 vinai/bartpho-word-base (model) · 211 downloads · 3 likes
- 🤗 PhucDanh/Bartpho-fine-tuning-model-for-question-answering (model) · 5 downloads · 3 likes
- 🤗 tranviethuy01/vinai-bartpho-word (model) · 1 download
- 🤗 tranviethuy01/vinai-bartpho-syllable (model) · 1 download
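
The checkpoints listed above can typically be loaded with the Hugging Face `transformers` library. The sketch below is a minimal example, assuming `transformers` and `torch` are installed and the Hugging Face Hub is reachable. The helper names `load_bartpho` and `encode` and the sample sentence are hypothetical, chosen only for illustration; the model identifier is taken from the list.

```python
MODEL_ID = "vinai/bartpho-syllable"  # identifier from the list above


def load_bartpho(model_id=MODEL_ID):
    """Download (or load from cache) the tokenizer and model for model_id."""
    # Imported lazily so this file can be imported without the libraries.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def encode(text, tokenizer, model):
    """Run one sentence through the model and return its hidden states."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for feature extraction
        outputs = model(**inputs)
    return outputs.last_hidden_state


if __name__ == "__main__":
    tok, bartpho = load_bartpho()
    features = encode("Chúng tôi là những nghiên cứu viên.", tok, bartpho)
    print(features.shape)
```

The word-level checkpoints expect word-segmented input, so raw text would usually need to pass through a Vietnamese word segmenter before tokenization; the syllable-level checkpoint used here accepts raw text.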
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Attention Is All You Need · Linear Layer · Denoising Autoencoder · Dense Connections · Multi-Head Attention · Byte Pair Encoding · Softmax · Dropout · Adam
