Reducing Labeling Costs in Sentiment Analysis via Semi-Supervised Learning

Minoo Jafarlou; Mario M. Kubek

arXiv:2410.11355 · cs.LG · October 16, 2024

Open Access

TL;DR

This paper presents a semi-supervised learning approach using label propagation and graph-based pseudo-labeling to significantly cut down labeling costs in sentiment analysis, demonstrating effectiveness in text classification.

Contribution

The paper introduces a graph-based semi-supervised method for sentiment analysis that relies on the manifold assumption and deep neural network embeddings, reducing the need for extensive labeled data.

Findings

01. Effective reduction in labeling costs demonstrated

02. Pseudo-labeling improves classification accuracy

03. Method outperforms traditional supervised approaches

Abstract

Labeling datasets is a significant challenge in machine learning, in both cost and time. This research addresses it with label propagation in semi-supervised learning, which can significantly reduce the number of labels required compared to traditional methods. We employ a transductive label propagation method based on the manifold assumption for text classification. Our approach uses a graph-based method to generate pseudo-labels for unlabeled data, which are then used to train deep neural networks. By extending labels based on cosine proximity within a nearest-neighbor graph built from network embeddings, we incorporate unlabeled data into supervised learning, thereby reducing labeling costs. Based on previous successes in other domains, this study builds and evaluates this approach's effectiveness in sentiment…
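
The graph-based pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes sentence embeddings are already available as a NumPy array, builds a symmetric k-nearest-neighbor graph from cosine similarity, and spreads the few known labels with a closed-form normalized diffusion (the "local and global consistency" formulation of label propagation), which may differ from the variant used in the paper. The function names `knn_cosine_graph` and `propagate_labels`, the parameters `k` and `alpha`, and the toy embeddings are all hypothetical.

```python
import numpy as np


def knn_cosine_graph(emb, k):
    """Symmetric k-nearest-neighbor affinity matrix from cosine similarity."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)            # never pick a point as its own neighbor
    n = sim.shape[0]
    nbrs = np.argsort(-sim, axis=1)[:, :k]    # indices of the k most similar points
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.clip(sim[rows, cols], 0.0, None)  # keep non-negative weights
    return np.maximum(W, W.T)                 # symmetrize


def propagate_labels(W, labels, n_classes, alpha=0.9):
    """Spread known labels over the graph; labels uses -1 for unlabeled points.

    Assumed formulation: F = (I - alpha * S)^-1 Y with S = D^-1/2 W D^-1/2.
    """
    deg = W.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    known = labels >= 0
    Y = np.zeros((len(labels), n_classes))
    Y[known, labels[known]] = 1.0             # one-hot rows for labeled points
    F = np.linalg.solve(np.eye(len(labels)) - alpha * S, Y)
    pseudo = F.argmax(axis=1)
    pseudo[known] = labels[known]             # keep the true labels where available
    return pseudo, F


# Toy example (hypothetical data): two groups of 2-D "embeddings",
# with one labeled point per class and four unlabeled points.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
                [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]])
labels = np.array([0, -1, -1, 1, -1, -1])

W = knn_cosine_graph(emb, k=2)
pseudo, scores = propagate_labels(W, labels, n_classes=2)
print(pseudo)
```

In a full pipeline, the resulting pseudo-labels would be combined with the labeled examples to train the neural classifier, typically down-weighting low-confidence pseudo-labels; that weighting is left out of this sketch.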

Taxonomy

Topics: Sentiment Analysis and Opinion Mining