Linguistically Informed Graph Model and Semantic Contrastive Learning for Korean Short Text Classification

JaeGeon Yoo, Byoungwook Kim, Yeongwook Yang, and Hong-Jun Jang

arXiv:2603.03652·cs.CL·March 5, 2026


Open Access

TL;DR

This paper introduces LIGRAM, a hierarchical graph model combined with semantic contrastive learning, specifically designed to improve Korean short text classification by capturing linguistic features and semantic similarities.

Contribution

The paper presents a novel hierarchical graph model and semantic contrastive learning approach tailored for Korean, addressing language-specific challenges in short text classification.

Findings

01

LIGRAM outperforms baseline models on Korean datasets.

02

Hierarchical graph construction effectively captures Korean linguistic features.

03

Semantic contrastive learning enhances class distinction in short texts.
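To make the third finding concrete, the sketch below shows a generic supervised contrastive loss in NumPy. It is an illustrative stand-in, not the paper's semantics-aware objective, whose exact form is not given on this page. The function name `supervised_contrastive_loss` and the `temperature` default are assumptions chosen for the example.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (assumed form, not the paper's).

    Pulls together embeddings that share a class label and pushes apart
    the rest. Assumes at least one sample has a same-class partner.
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)

    # Cosine similarity between all pairs, scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Exclude each sample's similarity with itself.
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over the other samples in the batch.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(
        np.exp(sim - sim_max).sum(axis=1, keepdims=True)
    )

    # Positives: other samples carrying the same label as the anchor.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0

    # Average log-probability of the positives, per anchor that has any.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_counts[valid]).mean())


# Toy batch: two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_loss = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
mixed_loss = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

When the labels agree with the clusters (`aligned_loss`), the loss is much smaller than when they cut across them (`mixed_loss`), which is the sense in which a contrastive objective sharpens class distinction.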

Abstract

Short text classification (STC) remains a challenging task due to the scarcity of contextual information and labeled data. Existing approaches have predominantly focused on English because most STC benchmark datasets are primarily available in English. Consequently, existing methods seldom incorporate the linguistic and structural characteristics of Korean, such as its agglutinative morphology and flexible word order. To address these limitations, we propose LIGRAM, a hierarchical heterogeneous graph model for Korean short-text classification. The proposed model constructs sub-graphs at the morpheme, part-of-speech, and named-entity levels and hierarchically integrates them to compensate for the limited contextual information in short texts while precisely capturing the grammatical and semantic dependencies inherent in Korean. In addition, we apply Semantics-aware…
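The three sub-graph levels named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the input format (pre-analysed `(morpheme, POS tag, entity label)` triples), the function name `build_level_graphs`, the sliding-window co-occurrence edges, and the `window` parameter. The abstract does not specify how LIGRAM defines edges or how it integrates the levels, so the sketch stops at building one co-occurrence graph per level.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Optional, Tuple

# Hypothetical input: one (morpheme, POS tag, entity label or None) triple
# per morpheme, as a morphological analyser plus NER tagger might produce.
Token = Tuple[str, str, Optional[str]]
Edge = Tuple[str, str]


def build_level_graphs(
    docs: List[List[Token]], window: int = 2
) -> Dict[str, Dict[Edge, int]]:
    """Build co-occurrence edge counts at three levels (illustrative only)."""
    graphs = {
        "morpheme": defaultdict(int),
        "pos": defaultdict(int),
        "entity": defaultdict(int),
    }
    for doc in docs:
        morphemes = [m for m, _, _ in doc]
        pos_tags = [t for _, t, _ in doc]
        # Entity nodes are the surface forms that carry an entity label.
        entities = sorted({m for m, _, e in doc if e is not None})

        # Morpheme and POS levels: link items within a sliding window.
        for level, seq in (("morpheme", morphemes), ("pos", pos_tags)):
            for i in range(len(seq)):
                for j in range(i + 1, min(i + window + 1, len(seq))):
                    if seq[i] != seq[j]:
                        edge = tuple(sorted((seq[i], seq[j])))
                        graphs[level][edge] += 1

        # Entity level: link entities that appear in the same short text.
        for a, b in combinations(entities, 2):
            graphs["entity"][(a, b)] += 1

    return {level: dict(edges) for level, edges in graphs.items()}


# Toy example using Sejong-style POS tags.
example_docs = [[
    ("삼성", "NNP", "ORG"),
    ("서울", "NNP", "LOC"),
    ("에서", "JKB", None),
    ("회의", "NNG", None),
]]
example_graphs = build_level_graphs(example_docs, window=2)
```

Keeping the levels as separate edge dictionaries mirrors the idea of distinct sub-graphs; a full model would still need node features and a mechanism for combining the levels, which this sketch leaves out.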


Taxonomy

Topics: Topic Modeling · Text and Document Classification Technologies · Text Readability and Simplification