Monolingual versus Multilingual BERTology for Vietnamese Extractive Multi-Document Summarization

Huy Quoc To; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen; Anh Gia-Tuan Nguyen

arXiv:2108.13741·cs.CL·October 19, 2021·6 cites

Open Access

TL;DR

This paper explores the effectiveness of monolingual versus multilingual BERT models for extractive multi-document summarization in Vietnamese, showing monolingual models outperform multilingual ones.

Contribution

It introduces a comparative analysis of monolingual and multilingual BERT models specifically for Vietnamese multi-document summarization.

Findings

01

Monolingual BERT models outperform multilingual models in Vietnamese summarization.

02

The proposed BERT-based approach achieves promising results compared to previous Vietnamese text summarization models.

03

Multilingual BERT models are less effective than monolingual models as encoders for Vietnamese extractive multi-document summarization.

Abstract

Recent research has demonstrated that BERT shows potential in a wide range of natural language processing tasks. It has been adopted as an encoder in many state-of-the-art automatic summarization systems, which achieve excellent performance. However, little work has been done so far for Vietnamese. In this paper, we show how BERT can be applied to extractive multi-document summarization in Vietnamese. We introduce a novel comparison between different multilingual and monolingual BERT models. The experimental results indicate that monolingual models produce promising results compared to multilingual models and to previous text summarization models for Vietnamese.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Softmax · Residual Connection · Layer Normalization · Dense Connections · Attention Dropout