A New Korean Text Classification Benchmark for Recognizing the Political Intents in Online Newspapers
Beomjune Kim, Eunsun Lee, Dongbin Na

TL;DR
This paper introduces a large-scale Korean news dataset for classifying political intent in online articles, along with deep-learning baseline models trained on its two labeling tasks.
Contribution
It presents the first large-scale Korean news dataset with multi-task labels for political orientation and pro-government stance, and provides baseline deep learning models.
Findings
Deep learning models achieve decent classification accuracy.
The dataset is the largest of its kind for Korean political news.
Models trained on the dataset outperform previous benchmarks.
Abstract
Many users reading online articles in various magazines may have considerable difficulty distinguishing the implicit intents in texts. In this work, we focus on automatically recognizing the political intents of a given online newspaper article by understanding the context of the text. To solve this task, we present a novel Korean text classification dataset that contains various articles. We also provide deep-learning-based text classification baseline models trained on the proposed dataset. Our dataset contains 12,000 news articles that may contain political intentions, from the politics section of six of the most representative newspaper organizations in South Korea. All the text samples are labeled simultaneously in two aspects: (1) the level of political orientation and (2) the level of pro-government stance. To the best of our knowledge, our paper is the most large-scale Korean news dataset…
Taxonomy
Topics: Computational and Text Analysis Methods
