VCWE: Visual Character-Enhanced Word Embeddings
Chi Sun, Xipeng Qiu, Xuanjing Huang

TL;DR
This paper introduces VCWE, a model that learns Chinese word embeddings from the visual shapes of characters using neural networks, improving performance across multiple NLP tasks.
Contribution
It presents a three-level composition model for Chinese word embeddings: a convolutional network over the visual shape of each character, a recurrent network with self-attention that composes characters into words, and Skip-Gram training on context.
Findings
Outperforms existing models on word similarity tasks
Achieves higher accuracy in sentiment analysis
Improves results in named entity recognition and POS tagging
Abstract
Chinese is a logographic writing system, and the shapes of Chinese characters contain rich syntactic and semantic information. In this paper, we propose a model to learn Chinese word embeddings via three-level composition: (1) a convolutional neural network to extract the intra-character compositionality from the visual shape of a character; (2) a recurrent neural network with self-attention to compose character representations into word embeddings; (3) the Skip-Gram framework to capture non-compositionality directly from the contextual information. Evaluations demonstrate the superior performance of our model on four tasks: word similarity, sentiment analysis, named entity recognition, and part-of-speech tagging.
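The three levels described in the abstract can be illustrated with a toy, dependency-free sketch. It is not the authors' implementation: all function names, sizes, and the random bitmaps standing in for rendered glyphs are assumptions made for illustration, the recurrent layer of level 2 is omitted so that only the attention pooling is shown, and the loss is the standard Skip-Gram negative-sampling objective, which may differ from the exact objective used in the paper.

```python
import math
import random


def conv_max_pool(bitmap, kernels):
    """Level 1 (toy): convolve each kernel over a character bitmap,
    apply ReLU, and global-max-pool to one feature per kernel."""
    h, w = len(bitmap), len(bitmap[0])
    features = []
    for kernel in kernels:
        kh, kw = len(kernel), len(kernel[0])
        best = 0.0  # ReLU floor: pooled activation is never below zero
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                act = sum(kernel[a][b] * bitmap[i + a][j + b]
                          for a in range(kh) for b in range(kw))
                best = max(best, act)
        features.append(best)
    return features


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(char_vectors, query):
    """Level 2 (toy): softmax attention over character vectors.
    The recurrent encoder from the abstract is omitted here."""
    scores = [dot(c, query) for c in char_vectors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(char_vectors[0])
    word = [sum(wt * c[k] for wt, c in zip(weights, char_vectors))
            for k in range(dim)]
    return word, weights


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sgns_loss(word_vec, context_vec, negative_vecs):
    """Level 3: standard Skip-Gram negative-sampling loss for one pair."""
    loss = -log_sigmoid(dot(word_vec, context_vec))
    for neg in negative_vecs:
        loss -= log_sigmoid(-dot(word_vec, neg))
    return loss


# Hypothetical toy setup: random 8x8 bitmaps stand in for rendered glyphs.
random.seed(0)
SIZE, N_KERNELS = 8, 4


def random_vector(dim):
    return [random.uniform(-1.0, 1.0) for _ in range(dim)]


def random_bitmap():
    return [[random.randint(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]


kernels = [[random_vector(3) for _ in range(3)] for _ in range(N_KERNELS)]

# A two-character word: one visual feature vector per character.
char_vectors = [conv_max_pool(random_bitmap(), kernels) for _ in range(2)]
word_vec, weights = attention_pool(char_vectors, random_vector(N_KERNELS))

context_vec = random_vector(N_KERNELS)
negative_vecs = [random_vector(N_KERNELS) for _ in range(3)]
loss = sgns_loss(word_vec, context_vec, negative_vecs)
```

In a trained model the kernels, the attention query, and the context vectors would all be learned by minimizing this loss over a corpus; here they are random, so the sketch only shows how the pieces connect.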
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
