VCWE: Visual Character-Enhanced Word Embeddings

Chi Sun; Xipeng Qiu; Xuanjing Huang

arXiv:1902.08795 · cs.CL · March 26, 2019 · 1 citation


TL;DR

This paper introduces VCWE, a model that learns Chinese word embeddings from the visual shapes of characters and reports improved performance on four NLP tasks.

Contribution

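It presents a three-level composition model for Chinese word embeddings: a convolutional neural network over the visual shape of each character, a recurrent neural network with self-attention that composes character representations into word embeddings, and Skip-Gram training on contextual information.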

Findings

01 Outperforms existing models on word similarity tasks

02 Achieves higher accuracy in sentiment analysis

03 Improves results in named entity recognition and POS tagging

Abstract

Chinese is written in a logographic system, and the shapes of Chinese characters contain rich syntactic and semantic information. In this paper, we propose a model to learn Chinese word embeddings via three-level composition: (1) a convolutional neural network to extract the intra-character compositionality from the visual shape of a character; (2) a recurrent neural network with self-attention to compose character representations into word embeddings; (3) the Skip-Gram framework to capture non-compositionality directly from the contextual information. Evaluations demonstrate the superior performance of our model on four tasks: word similarity, sentiment analysis, named entity recognition and part-of-speech tagging.
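The three levels named in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the 4x4 glyph bitmaps, the two stroke kernels, and the query vector are hypothetical toy values. Level 2 uses plain attention pooling in place of the recurrent network with self-attention. Level 3 uses the common negative-sampling form of the Skip-Gram loss, which is an assumption about the training objective.

```python
import math

def conv_maxpool(glyph, kernels):
    """Level 1 stand-in: valid 2D convolution per kernel, then global max pooling."""
    features = []
    for k in kernels:
        kh, kw = len(k), len(k[0])
        best = -math.inf
        for i in range(len(glyph) - kh + 1):
            for j in range(len(glyph[0]) - kw + 1):
                s = sum(glyph[i + a][j + b] * k[a][b]
                        for a in range(kh) for b in range(kw))
                best = max(best, s)
        features.append(best)
    return features

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(char_vecs, query):
    """Level 2 stand-in: attention-weighted sum of character vectors (no RNN here)."""
    scores = [sum(c * q for c, q in zip(vec, query)) for vec in char_vecs]
    weights = softmax(scores)
    dim = len(char_vecs[0])
    word = [sum(w * vec[d] for w, vec in zip(weights, char_vecs))
            for d in range(dim)]
    return word, weights

def log_sigmoid(x):
    # Stable log(sigmoid(x)) for both large positive and large negative x.
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))

def sgns_loss(word_vec, context_vec, negative_vecs):
    """Level 3: Skip-Gram loss with negative sampling for one (word, context) pair."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    loss = -log_sigmoid(dot(word_vec, context_vec))
    for neg in negative_vecs:
        loss -= log_sigmoid(-dot(word_vec, neg))
    return loss

# Hypothetical toy glyphs: one horizontal stroke, one vertical stroke.
glyph_a = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
glyph_b = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
kernels = [[[1, 1]], [[1], [1]]]  # horizontal and vertical stroke detectors

char_vecs = [conv_maxpool(g, kernels) for g in (glyph_a, glyph_b)]
word_vec, weights = attention_pool(char_vecs, query=[1.0, 0.0])
loss = sgns_loss(word_vec, context_vec=[1.0, 1.0], negative_vecs=[[-1.0, -1.0]])
```

In a trained model the kernels, the attention parameters, and the context vectors would all be learned jointly by minimizing the Skip-Gram loss over a corpus; here they are fixed only to show how the three levels feed into one another.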


Code & Models

Repositories

HSLCY/VCWE (PyTorch, official)


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining