ViDeBERTa: A powerful pre-trained language model for Vietnamese

Cong Dao Tran; Nhut Huy Pham; Anh Nguyen; Truong Son Hy; Tu Vu

arXiv:2301.10439 · cs.CL · February 13, 2023

TL;DR

ViDeBERTa is a monolingual Vietnamese language model based on the DeBERTa architecture. It outperforms previous models on multiple Vietnamese NLP tasks despite having far fewer parameters.

Contribution

Introduces ViDeBERTa, a set of three pre-trained Vietnamese language models (ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large) that outperform existing models on multiple NLP tasks with fewer parameters.

Findings

01 ViDeBERTa surpasses previous state-of-the-art models on Vietnamese NLP tasks.

02 ViDeBERTa_base achieves comparable or better results with only 23% of PhoBERT_large's parameters.

03 Models are publicly available for further research and application.

Abstract

This paper presents ViDeBERTa, a new pre-trained monolingual language model for Vietnamese, in three versions: ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large. All three are pre-trained on a large-scale corpus of high-quality and diverse Vietnamese texts using the DeBERTa architecture. Although many successful pre-trained language models based on the Transformer have been proposed for English, there are still few pre-trained models for Vietnamese, a low-resource language, that achieve good results on downstream tasks, especially question answering. We fine-tune and evaluate our model on three important downstream natural language tasks: part-of-speech tagging, named-entity recognition, and question answering. The empirical results demonstrate that ViDeBERTa with far fewer parameters surpasses the previous state-of-the-art models on multiple Vietnamese-specific natural…

Code & Models

Repositories

hysonlab/videberta (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Adam · Position-Wise Feed-Forward Layer · Softmax · Linear Layer · Absolute Position Encodings · Dropout