The Tonogenesis Continuum in Tibetan: A Computational Investigation
Siyu Liang, Zhaxi Zerong

TL;DR
This study uses computational methods to analyze the gradual evolution of tonal features in Tibetan languages, revealing a continuum from non-tonal to tonal systems in how speech recognition performance responds to pitch removal.
Contribution
It introduces a novel computational approach to quantify the functional role of pitch during tonogenesis, providing empirical evidence of a gradual transition in Tibetan dialects.
Findings
Atonal Amdo dialects tolerate pitch removal best
Fully tonal U-Tsang dialects show severe degradation with pitch flattening
Kham dialects, at an intermediate stage of tonogenesis, fall between these two extremes
Abstract
Tonogenesis, the historical process by which segmental contrasts evolve into lexical tone, has traditionally been studied through comparative reconstruction and acoustic phonetics. We introduce a computational approach that quantifies the functional role of pitch at different stages of this sound change by measuring how pitch manipulation affects automatic speech recognition (ASR) performance. By analyzing sensitivity to pitch flattening across a set of closely related Tibetan languages, we find evidence of a tonogenesis continuum: atonal Amdo dialects tolerate pitch removal best, fully tonal U-Tsang varieties show severe degradation, and intermediate Kham dialects fall measurably between these extremes. These gradient effects demonstrate how ASR models implicitly learn the shifting functional load of pitch as languages transition from consonant-based to tone-based…
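As a rough illustration of what measuring sensitivity to pitch flattening could look like, the sketch below compares a token-level error rate on ASR output for the original audio with the error rate on output for pitch-flattened audio. The excerpt does not state the paper's exact metric, so the absolute increase in error rate used here, the function names (`edit_distance`, `error_rate`, `pitch_sensitivity`), and the whitespace tokenization are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str]) -> float:
    """Corpus-level error rate: total edits / total reference tokens.

    Assumes whitespace-separated tokens (hypothetical choice; a real
    Tibetan pipeline might score syllables or characters instead).
    """
    total_edits = 0
    total_tokens = 0
    for ref, hyp in zip(refs, hyps):
        ref_toks, hyp_toks = ref.split(), hyp.split()
        total_edits += edit_distance(ref_toks, hyp_toks)
        total_tokens += len(ref_toks)
    if total_tokens == 0:
        raise ValueError("reference transcripts contain no tokens")
    return total_edits / total_tokens


def pitch_sensitivity(refs: List[str],
                      hyps_original: List[str],
                      hyps_flattened: List[str]) -> float:
    """Absolute increase in error rate caused by pitch flattening.

    This is an assumed metric: a larger value means the recognizer
    relied more heavily on pitch for that variety.
    """
    return error_rate(refs, hyps_flattened) - error_rate(refs, hyps_original)
```

Under this assumed metric, a variety in which pitch carries little lexical information would score near zero, and a fully tonal variety would score higher. Computing the score per dialect group and ordering the groups would then give one view of the continuum the abstract describes.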
