Deep Supervised Contrastive Learning of Pitch Contours for Robust Pitch Accent Classification in Seoul Korean

Hyunjung Joo; GyeongTaek Lee

arXiv:2604.19477·cs.SD·April 22, 2026

TL;DR

This paper introduces Dual-Glob, a deep contrastive learning framework that improves classification of pitch accents in Seoul Korean by capturing holistic $F_0$ contour shapes, supported by a new large-scale annotated dataset.

Contribution

The paper presents the first large-scale benchmark dataset for pitch accent classification in Seoul Korean, together with Dual-Glob, a deep supervised contrastive learning method for classifying these accents robustly.

Findings

01

Dual-Glob achieves 77.75% accuracy and 51.54% F1-score, outperforming baseline models.

02

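By enforcing consistency between clean and augmented views in a shared latent space, the approach captures the holistic shape of $F_0$ contours rather than only local cues.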

03

The dataset contains 10,093 manually annotated Accentual Phrases.

Abstract

The intonational structure of Seoul Korean has been defined in terms of discrete tonal categories within the Autosegmental-Metrical model of intonational phonology. However, mapping continuous $F_{0}$ contours to these invariant categories is challenging because $F_{0}$ realizations vary widely in real-world speech. Our paper proposes Dual-Glob, a deep supervised contrastive learning framework that robustly classifies fine-grained pitch accent patterns in Seoul Korean. Unlike conventional local predictive models, our approach captures holistic $F_{0}$ contour shapes by enforcing structural consistency between clean and augmented views in a shared latent space. To this end, we introduce the first large-scale benchmark dataset, consisting of 10,093 manually annotated Accentual Phrases in Seoul Korean. Experimental results show that our Dual-Glob significantly outperforms strong baseline models with…
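
The idea of pulling clean and augmented views of same-class contours together in a shared latent space can be illustrated with a generic supervised contrastive loss. The sketch below is not the Dual-Glob implementation: the abstract does not specify the architecture, the augmentations, or the exact loss, so `augment_contour`, `encode_shape`, and all parameter values are hypothetical stand-ins chosen only to make the example runnable.

```python
import numpy as np


def augment_contour(f0, rng, max_shift=2.0, jitter_std=0.05):
    """Hypothetical augmentation: shift the whole contour and add small jitter."""
    shift = rng.uniform(-max_shift, max_shift)
    return f0 + shift + rng.normal(0.0, jitter_std, size=f0.shape)


def encode_shape(contours):
    """Stand-in encoder (assumption): remove each contour's mean, then L2-normalize.

    A real model would use a learned network; this only keeps contour shape.
    """
    x = np.asarray(contours, dtype=np.float64)
    x = x - x.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of embeddings.

    For each anchor, every other sample with the same label is a positive;
    all other samples (except the anchor itself) form the denominator.
    """
    z = np.asarray(z, dtype=np.float64)
    z = z / np.maximum(np.linalg.norm(z, axis=1, keepdims=True), 1e-12)
    labels = np.asarray(labels)
    n = z.shape[0]

    self_mask = np.eye(n, dtype=bool)
    logits = np.where(self_mask, -np.inf, z @ z.T / temperature)

    # Numerically stable log-softmax over all non-self samples.
    row_max = logits.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(logits - row_max).sum(axis=1, keepdims=True))
    log_prob = logits - log_denom

    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0  # anchors with at least one positive
    if not valid.any():
        return 0.0

    sum_log_prob_pos = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float((-sum_log_prob_pos[valid] / pos_count[valid]).mean())


# Toy usage: two rising and two falling contours, each with one augmented view.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
clean = np.stack([t, 0.9 * t, -t, -0.9 * t])
labels = np.array([0, 0, 1, 1])
augmented = np.stack([augment_contour(c, rng) for c in clean])

z = encode_shape(np.concatenate([clean, augmented]))
loss = supervised_contrastive_loss(z, np.concatenate([labels, labels]))
print(f"toy supervised contrastive loss: {loss:.4f}")
```

In this toy setup the loss is small when embeddings of the same class sit close together and large when same-label samples are far apart, which is the property a contrastive objective of this kind rewards.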
