On the Vietnamese Name Entity Recognition: A Deep Learning Method Approach
Ngoc C. Lê, Ngoc-Yen Nguyen, and Anh-Duong Trinh

TL;DR
This paper introduces a deep learning approach combining Bi-LSTM and CRF for Vietnamese NER, utilizing word embeddings and semantic features to improve accuracy on the VLSP2016 dataset.
Contribution
It presents a novel deep learning model that combines word embeddings with semantic and syntactic features for Vietnamese NER, achieving the best results reported so far on the VLSP2016 dataset.
Findings
Achieved the best results reported so far on the VLSP2016 dataset
Enhanced NER accuracy with combined semantic and syntactic features
Demonstrated effectiveness of Bi-LSTM-CRF architecture for Vietnamese NER
Abstract
Named entity recognition (NER) plays an important role in text-based information retrieval. In this paper, we combine Bidirectional Long Short-Term Memory (Bi-LSTM) \cite{hochreiter1997,schuster1997} with Conditional Random Field (CRF) \cite{lafferty2001} to create a novel deep learning model for the NER problem. Each input word of the deep learning model is represented by a Word2Vec-trained vector. The word embedding set was trained on about one million articles from 2018 collected through a Vietnamese news portal (baomoi.com). In addition, we concatenate the Word2Vec\cite{mikolov2013}-trained vector with a semantic feature vector (Part-Of-Speech (POS) tag, chunk tag) and a hidden syntactic feature vector (extracted by a Bi-LSTM network) to achieve the best result so far for a Vietnamese NER system. The experiments were conducted on the VLSP2016 data set (Vietnamese Language and Speech Processing…
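The two ideas in the abstract that are easiest to misread are the feature concatenation and the role of the CRF layer, so a minimal sketch may help. It is not the authors' implementation: the tag inventories, the function names `one_hot`, `token_input`, and `viterbi`, and all numeric scores are assumptions made up for illustration. In the real model the emission scores would come from the Bi-LSTM and the transition scores would be learned; here both are hard-coded.

```python
from typing import Dict, List, Sequence

# Hypothetical tag inventories, chosen only for illustration.
POS_TAGS = ["N", "Np", "V", "E"]
CHUNK_TAGS = ["B-NP", "I-NP", "B-VP", "O"]


def one_hot(value: str, inventory: Sequence[str]) -> List[float]:
    """Return a one-hot vector marking the position of value in inventory."""
    return [1.0 if item == value else 0.0 for item in inventory]


def token_input(word_vec: Sequence[float], pos: str, chunk: str) -> List[float]:
    """Concatenate a word vector with one-hot POS and chunk features."""
    return list(word_vec) + one_hot(pos, POS_TAGS) + one_hot(chunk, CHUNK_TAGS)


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag path under a linear-chain scoring model."""
    tags = list(emissions[0])
    score = dict(emissions[0])  # best path score ending in each tag so far
    backpointers: List[Dict[str, str]] = []
    for emit in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Best previous tag for reaching `cur` at this position.
            best_prev = max(tags, key=lambda p: score[p] + transitions[p][cur])
            new_score[cur] = score[best_prev] + transitions[best_prev][cur] + emit[cur]
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Made-up per-token scores standing in for Bi-LSTM outputs.
emissions = [
    {"O": 2.0, "B-PER": 0.5, "I-PER": 0.0},
    {"O": 0.0, "B-PER": 1.0, "I-PER": 1.5},
    {"O": 0.1, "B-PER": 0.0, "I-PER": 2.0},
]
# Made-up transition scores; O -> I-PER is heavily penalized.
transitions = {
    "O": {"O": 0.0, "B-PER": 0.0, "I-PER": -10.0},
    "B-PER": {"O": 0.0, "B-PER": 0.0, "I-PER": 1.0},
    "I-PER": {"O": 0.0, "B-PER": 0.0, "I-PER": 0.5},
}

features = token_input([0.1, 0.2], "Np", "B-NP")
decoded = viterbi(emissions, transitions)
greedy = [max(e, key=e.get) for e in emissions]
print(len(features), decoded, greedy)
```

With these numbers, picking the top-scoring tag per token yields `O, I-PER, I-PER`, which starts an entity with an inside tag. Decoding over the whole sequence with transition scores yields `O, B-PER, I-PER` instead, which is the kind of label consistency a CRF output layer is meant to provide on top of per-token scores.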
