Investigating Monolingual and Multilingual BERT Models for Vietnamese Aspect Category Detection
Dang Van Thin, Lac Si Le, Vu Xuan Hoang, Ngan Luu-Thuy Nguyen

TL;DR
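This paper compares monolingual and multilingual BERT models on Vietnamese aspect category detection. It finds that the monolingual PhoBERT model outperforms the others on two benchmark datasets, and it also evaluates a multilingual model trained on SemEval-2016 datasets in other languages combined with the Vietnamese data.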
Contribution
First comprehensive evaluation of pre-trained language models on Vietnamese ACD, including the use of multilingual datasets and models for cross-lingual performance analysis.
Findings
Monolingual PhoBERT outperforms the other pre-trained models on both Vietnamese benchmark datasets (restaurant and hotel domains).
Multilingual models benefit from training on datasets combined across languages.
Abstract
Aspect category detection (ACD) is one of the challenging tasks in Aspect-Based Sentiment Analysis. The purpose of this task is to identify the aspect categories mentioned in user-generated reviews from a set of pre-defined categories. In this paper, we investigate the performance of various monolingual pre-trained language models compared with multilingual models on the Vietnamese aspect category detection problem. We conduct experiments on two benchmark datasets for the restaurant and hotel domains. The experimental results demonstrate that the monolingual PhoBERT model is more effective than the others on both datasets. We also evaluate the performance of the multilingual model trained on the full SemEval-2016 datasets in other languages combined with the Vietnamese dataset. To the best of our knowledge, our research study is the first attempt at performing various…
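Because a single review can mention several of the pre-defined categories at once, ACD is commonly framed as multi-label classification. The sketch below illustrates that framing only; it is not the authors' implementation. The category names, the 0.5 threshold, and the choice of micro-averaged F1 as the metric are all assumptions made for illustration, and the logits stand in for the output of whichever pre-trained encoder is used.

```python
import math

# Hypothetical label set for illustration; the paper's actual categories are not listed here.
CATEGORIES = ["FOOD#QUALITY", "SERVICE#GENERAL", "AMBIENCE#GENERAL", "RESTAURANT#PRICES"]


def encode_labels(gold):
    """Turn a collection of gold category names into a multi-hot target vector."""
    return [1 if c in gold else 0 for c in CATEGORIES]


def predict_categories(logits, threshold=0.5):
    """Apply an independent sigmoid per category and keep those at or above the threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [c for c, p in zip(CATEGORIES, probs) if p >= threshold]


def micro_f1(gold_sets, pred_sets):
    """Micro-averaged F1 over all (review, category) decisions (assumed metric)."""
    tp = sum(len(g & p) for g, p in zip(gold_sets, pred_sets))
    n_pred = sum(len(p) for p in pred_sets)
    n_gold = sum(len(g) for g in gold_sets)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)
```

In this framing, each category gets its own yes/no decision, so a review that praises the food and complains about the price yields two labels rather than one.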
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies · Advanced Text Analysis Techniques
