Hierarchical Character Embeddings: Learning Phonological and Semantic Representations in Languages of Logographic Origin using Recursive Neural Networks

Minh Nguyen; Gia H. Ngo; Nancy F. Chen

arXiv:1912.09913·cs.CL·June 29, 2020

TL;DR

This paper introduces hierarchical logograph embeddings using recursive neural networks, leveraging their recursive structures to improve phonological prediction and language modeling, outperforming baseline methods.

Contribution

It proposes a novel recursive neural network approach to embed logographs based on their hierarchical structures, enhancing phonological and semantic task performance.

Findings

01. Hierarchical embeddings outperform baseline approaches on phonological prediction and language modeling.

02. The embeddings are more robust to distractors, especially on complex logographs.

03. Recursive neural networks effectively capture logograph structures.

Abstract

Logographs (Chinese characters) have recursive structures (i.e., hierarchies of sub-units within logographs) that contain phonological and semantic information, and developmental psychology literature suggests that native speakers leverage these structures to learn how to read. Exploiting these structures could potentially lead to better embeddings that can benefit many downstream tasks. We propose building hierarchical logograph (character) embeddings from logograph recursive structures using treeLSTM, a recursive neural network. Using a recursive neural network imposes a prior on the mapping from logographs to embeddings, since the network must read in the sub-units of a logograph in the order specified by its recursive structure. Based on human behavior in language learning and reading, we hypothesize that modeling logographs' structures using a recursive neural network should be…
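
To make the idea of reading sub-units in the order given by the recursive structure concrete, here is a minimal sketch in plain Python. It is not the paper's treeLSTM: the composition step is a simplified stand-in (sum the child vectors, apply one weight matrix, then tanh), and the weights are random rather than learned. The names `Node`, `make_params`, and `embed` are hypothetical, and representing the decomposition with ideographic description characters (⿱ for top-bottom, ⿰ for left-right) is an assumption made for illustration. The example character 想 is treated as 相 over 心, with 相 itself being 木 beside 目.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Dict, List

DIM = 4  # embedding size, chosen arbitrarily for this illustration


@dataclass
class Node:
    """One node of a logograph decomposition tree (hypothetical structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def make_params(labels: List[str], dim: int = DIM, seed: int = 0):
    """Create random (untrained) per-label vectors and one shared weight matrix."""
    rng = random.Random(seed)
    vecs: Dict[str, List[float]] = {
        lab: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for lab in labels
    }
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
    return vecs, weight


def embed(node: Node, vecs, weight, visited: List[str]) -> List[float]:
    """Embed `node` bottom-up; `visited` records the order nodes are finished."""
    own = vecs[node.label]
    if not node.children:
        visited.append(node.label)
        return own
    # Children are embedded first, so a parent only sees finished sub-units.
    child_vecs = [embed(child, vecs, weight, visited) for child in node.children]
    summed = [sum(v[i] for v in child_vecs) for i in range(len(own))]
    visited.append(node.label)
    # Simplified composition: tanh(own + W @ summed). A treeLSTM would use
    # gated cell updates here instead.
    return [
        math.tanh(own[i] + sum(weight[i][j] * summed[j] for j in range(len(summed))))
        for i in range(len(own))
    ]


# 想 = ⿱(相, 心), where 相 = ⿰(木, 目)
tree = Node("⿱", [Node("⿰", [Node("木"), Node("目")]), Node("心")])
vecs, weight = make_params(["⿱", "⿰", "木", "目", "心"])
visited: List[str] = []
vector = embed(tree, vecs, weight, visited)
print(visited)  # order in which sub-units were consumed
```

The recorded order shows the structural prior at work: leaves are consumed before the operator that joins them, so the root embedding can only be built from already-composed sub-units. In a trained model the vectors and weights would be learned, and the composition function would be replaced by the treeLSTM cell.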


Code & Models

Repositories

mnhng/hier-char-emb
PyTorch · Official
