ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation

Long Phan; Hieu Tran; Hieu Nguyen; Trieu H. Trinh

arXiv:2205.06457 · cs.CL · May 27, 2022 · 1 citation

TL;DR

ViT5 is a pretrained Transformer model tailored for Vietnamese language tasks. It achieves state-of-the-art results in text summarization and competitive performance in named entity recognition, and its experiments highlight the importance of context length during pretraining.

Contribution

This work introduces ViT5, a Vietnamese-specific pretrained encoder-decoder Transformer model, and evaluates it against other pretrained encoder-decoder models on two downstream tasks: abstractive text summarization and named entity recognition.

Findings

01. ViT5 outperforms existing models in Vietnamese text summarization.

02. ViT5 achieves competitive results in Vietnamese Named Entity Recognition.

03. Context length during pretraining significantly impacts downstream performance.

Abstract

We present ViT5, a pretrained Transformer-based encoder-decoder model for the Vietnamese language. With T5-style self-supervised pretraining, ViT5 is trained on a large corpus of high-quality and diverse Vietnamese texts. We benchmark ViT5 on two downstream text generation tasks, Abstractive Text Summarization and Named Entity Recognition. Although Abstractive Text Summarization has been widely studied for the English language thanks to its rich and large source of data, there has been minimal research into the same task in Vietnamese, a much lower resource language. In this work, we perform exhaustive experiments on both Vietnamese Abstractive Summarization and Named Entity Recognition, validating the performance of ViT5 against many other pretrained Transformer-based encoder-decoder models. Our experiments show that ViT5 significantly outperforms existing models and achieves…
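
The "T5-style self-supervised pretraining" mentioned in the abstract refers to span corruption: spans of the input are replaced with sentinel tokens, and the model learns to generate the missing spans. The following is a minimal sketch of that objective, not the authors' pipeline. The `span_corrupt` helper, the whitespace tokenization, the hand-picked spans, and the `<extra_id_N>` sentinel naming (borrowed from the Hugging Face T5 tokenizer) are all assumptions made for illustration; real T5 pretraining samples spans at random over subword tokens.

```python
from typing import List, Tuple


def span_corrupt(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: mask each (start, end) span with a sentinel.

    Returns (inputs, targets) in the T5 span-corruption layout.
    Spans are half-open index ranges and must not overlap.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"  # assumed sentinel naming
        inputs.extend(tokens[cursor:start])  # keep text before the span
        inputs.append(sentinel)              # span is replaced in the input
        targets.append(sentinel)             # and spelled out in the target
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])               # keep text after the last span
    targets.append(f"<extra_id_{len(spans)}>")   # closing sentinel
    return inputs, targets


# Whitespace tokens stand in for subword tokens in this sketch.
tokens = "Hà Nội là thủ đô của Việt Nam".split()
inputs, targets = span_corrupt(tokens, [(0, 2), (5, 6)])
print(" ".join(inputs))   # <extra_id_0> là thủ đô <extra_id_1> Việt Nam
print(" ".join(targets))  # <extra_id_0> Hà Nội <extra_id_1> của <extra_id_2>
```

The encoder reads the corrupted input and the decoder is trained to emit the target sequence, which is why the same model can be fine-tuned directly on generation tasks such as summarization.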

Code & Models

Repositories

vietai/vit5 (JAX, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Linear Layer · Adam · Byte Pair Encoding · Absolute Position Encodings · Residual Connection · Label Smoothing · Position-Wise Feed-Forward Layer · Dense Connections · Attention Is All You Need · Dropout