Hate Speech Detection on Vietnamese Social Media Text using the Bidirectional-LSTM Model
Hang Thi-Thuy Do, Huy Duc Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

TL;DR
This paper presents a Bidirectional-LSTM model that classifies Vietnamese social media comments as Clean, Offensive, or Hate. The system scores 71.43% on the VLSP 2019 public test set and is a step toward automated moderation tools.
Contribution
The paper introduces a Bidirectional-LSTM approach specifically tailored for Vietnamese hate speech detection, demonstrating competitive performance on VLSP 2019 data.
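The general shape of a Bidirectional-LSTM classifier can be sketched as follows. This is a minimal NumPy forward pass under stated assumptions, not the authors' implementation: the weights are random and untrained, and the vocabulary size, embedding size, hidden size, and the choice to concatenate the two final hidden states are all illustrative. Only the three labels come from the paper.

```python
import numpy as np

LABELS = ["Clean", "Offensive", "Hate"]  # label set from the shared task

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_dim, hidden_dim, rng):
    """Randomly initialise one LSTM direction (untrained, illustrative only)."""
    return {
        "W": rng.normal(0, 0.1, (4 * hidden_dim, input_dim + hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }

def lstm_last_state(params, inputs):
    """Run an LSTM over inputs of shape (seq_len, input_dim); return the final hidden state."""
    hidden_dim = params["b"].shape[0] // 4
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x in inputs:
        z = params["W"] @ np.concatenate([x, h]) + params["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def bilstm_classify(token_ids, embeddings, fwd, bwd, W_out, b_out):
    """Return class probabilities for one comment given as an array of token ids."""
    x = embeddings[token_ids]                  # (seq_len, emb_dim)
    h_fwd = lstm_last_state(fwd, x)            # reads left to right
    h_bwd = lstm_last_state(bwd, x[::-1])      # reads right to left
    features = np.concatenate([h_fwd, h_bwd])  # assumed way of merging directions
    logits = W_out @ features + b_out
    exp = np.exp(logits - logits.max())        # numerically stable softmax
    return exp / exp.sum()

# Hypothetical sizes chosen only to keep the example small.
rng = np.random.default_rng(0)
vocab_size, emb_dim, hidden_dim = 50, 8, 16
embeddings = rng.normal(0, 0.1, (vocab_size, emb_dim))
fwd = init_lstm(emb_dim, hidden_dim, rng)
bwd = init_lstm(emb_dim, hidden_dim, rng)
W_out = rng.normal(0, 0.1, (len(LABELS), 2 * hidden_dim))
b_out = np.zeros(len(LABELS))

probs = bilstm_classify(np.array([3, 17, 42, 5]), embeddings, fwd, bwd, W_out, b_out)
predicted = LABELS[int(np.argmax(probs))]
```

Because the weights are untrained, `predicted` carries no meaning here. The sketch only shows how the forward and backward passes are combined before the three-way softmax.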
Findings
Scored 71.43% on the VLSP 2019 public test set.
Classified comments into three categories: Clean, Offensive, and Hate.
Indicated that a Bidirectional-LSTM model is a workable approach for Vietnamese social media text.
Abstract
In this paper, we describe our system, which participated in the shared task of Hate Speech Detection on Social Networks of the VLSP 2019 evaluation campaign. We are provided with a pre-labeled dataset and an unlabeled dataset of social media comments or posts. Our mission is to pre-process the data and build machine learning models to classify comments/posts. In this report, we use Bidirectional Long Short-Term Memory to build a model that can predict labels for social media text as Clean, Offensive, or Hate. With this system, we achieve comparative results with 71.43% on the public standard test set of VLSP 2019.
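The abstract mentions a pre-processing step without giving details. One plausible version is sketched below using only the Python standard library. Everything here is an assumption for illustration: the cleaning rules (Unicode normalisation, lowercasing, URL removal), the helper names `tokenize`, `build_vocab`, and `encode`, and the example comments. It also splits on syllables rather than using a Vietnamese word segmenter, which a real system might prefer.

```python
import re
import unicodedata
from collections import Counter

PAD, UNK = 0, 1  # reserved ids for padding and out-of-vocabulary tokens

def tokenize(text):
    """Normalise, lowercase, drop URLs, and split into word-like tokens."""
    text = unicodedata.normalize("NFC", text).lower()
    text = re.sub(r"https?://\S+", " ", text)
    return re.findall(r"\w+", text)

def build_vocab(texts, min_count=1):
    """Map each token seen at least min_count times to an integer id."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab, max_len):
    """Convert a comment to a fixed-length list of ids (truncate, then pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(text)][:max_len]
    return ids + [PAD] * (max_len - len(ids))

# Made-up example comments, not taken from the VLSP 2019 data.
comments = ["Bài viết rất hay!", "Xem thêm tại https://example.com nhé"]
vocab = build_vocab(comments)
encoded = [encode(c, vocab, max_len=6) for c in comments]
```

Fixed-length id sequences like `encoded` are the usual input to an embedding layer followed by a recurrent model.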
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
Methods: Test
