Taylor's law for Human Linguistic Sequences

Tatsuru Kobayashi; Kumiko Tanaka-Ishii

arXiv:1804.07893 · cs.CL · June 8, 2018 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a new way to quantify the structural complexity of language using Taylor's law. It analyzes over 1100 texts across 14 languages, compares them with other language-related data, and shows how the results can be used to evaluate language models.

Contribution

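It applies Taylor's law to natural language, finds that written texts have almost the same Taylor exponent across 14 languages, compares that exponent with those of other language-related data (child-directed speech, music, and programming language code), and shows how the exponent can be used to evaluate language models.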

Findings

01

Taylor exponents of written natural language texts are almost the same across languages

02

The exponent quantifies structural complexity in linguistic sequences

03

Findings aid in evaluating language models

Abstract

Taylor's law describes the fluctuation characteristics underlying a system in which the variance of an event within a time span grows by a power law with respect to the mean. Although Taylor's law has been applied in many natural and social systems, its application to language has been scarce. This article describes a new quantification of Taylor's law in natural language and reports an analysis of over 1100 texts across 14 languages. The Taylor exponents of written natural language texts were found to exhibit almost the same value. The exponent was also compared for other language-related data, such as child-directed speech, music, and programming language code. The results show how the Taylor exponent serves to quantify the fundamental structural complexity underlying linguistic time series. The article also shows the applicability of these findings in evaluating language models.
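
To make the power-law relation in the abstract concrete, the following is a minimal sketch of one way a Taylor exponent could be estimated from a token sequence. It is not the code from the official repository. The function name `taylor_exponent`, the non-overlapping windowing, the window size, and the choice to fit the standard deviation (rather than the variance) against the mean in log-log space are all assumptions made for illustration; fitting the variance instead would simply double the slope. The synthetic Zipf-like vocabulary is also hypothetical test data, not data from the paper.

```python
import math
import random
from collections import Counter


def taylor_exponent(tokens, window):
    """Estimate alpha in sigma ~ c * mu**alpha (illustrative sketch only).

    For every word, count its occurrences in each non-overlapping window,
    take the mean and standard deviation of those counts across windows,
    and fit a straight line to (log mu, log sigma) by least squares.
    """
    n_seg = len(tokens) // window
    if n_seg < 2:
        raise ValueError("need at least two full windows")

    # counts[word][i] = occurrences of word in the i-th window
    counts = {}
    for i in range(n_seg):
        segment = tokens[i * window:(i + 1) * window]
        for word, c in Counter(segment).items():
            counts.setdefault(word, [0] * n_seg)[i] = c

    xs, ys = [], []
    for per_window in counts.values():
        mu = sum(per_window) / n_seg
        var = sum((c - mu) ** 2 for c in per_window) / n_seg
        if var > 0:  # words with zero variance have no defined log sigma
            xs.append(math.log(mu))
            ys.append(0.5 * math.log(var))  # log sigma = 0.5 * log variance

    if len(xs) < 2:
        raise ValueError("not enough words with non-zero variance")

    # Ordinary least-squares slope of log sigma against log mu.
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all words have the same mean count")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical demo data: tokens drawn independently from a Zipf-like
# distribution, so there is no clustering of words across windows.
random.seed(0)
vocab = [f"w{r}" for r in range(1, 1001)]
weights = [1.0 / r for r in range(1, 1001)]
tokens = random.choices(vocab, weights=weights, k=200_000)

alpha = taylor_exponent(tokens, window=1000)
print(f"estimated exponent: {alpha:.2f}")
```

Because the demo tokens are drawn independently, the estimate should land close to 0.5, the value expected for a sequence without temporal clustering under the standard-deviation convention. A sequence in which words cluster in time would be expected to give a larger slope under the same procedure.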

Code & Models

Repositories

Group-TanakaIshii/word_taylor
Official

Taxonomy

Topics: Language and cultural evolution · Fractal and DNA sequence analysis · Topic Modeling