Network of two-Chinese-character compound words in Japanese language

Ken Yamamoto; Yoshihiro Yamazaki

arXiv:0902.4060·cs.CL·May 15, 2012

TL;DR

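This paper analyzes the network of two-Chinese-character compound words in Japanese, in which characters are nodes and compound words are edges. The full network shows small-world and scale-free properties. The subnetwork of common-use characters (joyo-kanji) remains small-world but shows no clear power law in its degree distribution, and the authors propose a model of the character selection process to reproduce this loss.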

Contribution

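It identifies small-world and scale-free properties in the Japanese compound-word network and introduces a model of the selection of common-use characters that reproduces the absence of a clear power-law degree distribution in the joyo-kanji subnetwork.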

Findings

01

The network exhibits small-world and scale-free properties.

02

The subnetwork restricted to common-use Chinese characters (joyo-kanji) also has the small-world property, but its degree distribution exhibits no clear power law.

03

A model of the selection process for common-use characters is proposed to reproduce the disappearance of the power-law degree distribution.

Abstract

Some statistical properties of a network of two-Chinese-character compound words in the Japanese language are reported. In this network, a node represents a Chinese character and an edge represents a two-Chinese-character compound word. It is found that this network has the "small-world" and "scale-free" properties. A network formed only by the Chinese characters for common use (joyo-kanji in Japanese), which is regarded as a subclass of the original network, also has the small-world property. However, the degree distribution of this subnetwork exhibits no clear power law. In order to reproduce the disappearance of the power-law property, a model of the selection process of the Chinese characters for common use is proposed.
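The construction described in the abstract (character as node, compound word as edge, common-use characters as a subnetwork) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes an undirected simple graph that ignores character order and skips words made of a repeated character, and the abstract specifies neither choice. The word list and the `toy_common` subset are made-up toy inputs, not the dictionary or the actual joyo-kanji list, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_network(words):
    """Adjacency sets: node = character, edge = two-character compound word."""
    adj = defaultdict(set)
    for word in words:
        if len(word) != 2:
            continue  # only two-character compounds form edges
        a, b = word
        if a == b:
            continue  # assumption: no self-loops
        adj[a].add(b)  # assumption: undirected, character order ignored
        adj[b].add(a)
    return dict(adj)


def induced_subnetwork(adj, keep):
    """Keep only the characters in `keep` and the edges between them."""
    keep = set(keep)
    return {u: nbrs & keep for u, nbrs in adj.items() if u in keep}


def degree_distribution(adj):
    """Map each degree k to the number of nodes that have degree k."""
    return Counter(len(nbrs) for nbrs in adj.values())


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Each edge between two neighbours is seen twice, hence the division.
        links = sum(1 for v in nbrs for w in adj[v] if w in nbrs) / 2
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Toy input for illustration only.
toy_words = ["学校", "学生", "生活", "校長", "先生", "学長"]
toy_common = {"学", "校", "生", "長"}  # arbitrary subset, NOT the joyo-kanji list

full = build_network(toy_words)
sub = induced_subnetwork(full, toy_common)
print(sorted(degree_distribution(full).items()))
print(sorted(degree_distribution(sub).items()))
print(round(average_clustering(full), 3))
```

On a real dictionary, comparing `degree_distribution(full)` with `degree_distribution(sub)` on log-log axes is one way to look for the loss of a power law in the restricted network. A small-world check would also need the average shortest path length, which this sketch leaves out.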
