Self-supervised Regularization for Text Classification
Meng Zhou, Zechen Li, Pengtao Xie

TL;DR
This paper introduces SSL-Reg, a self-supervised regularization method for text classification. It improves generalization when labeled training data is limited by training a supervised classification task jointly with an unsupervised auxiliary task.
Contribution
The paper proposes a novel SSL-based regularization technique that enhances text classification performance with limited labeled data by integrating self-supervised auxiliary tasks.
Findings
SSL-Reg outperforms baseline models on 17 datasets.
The method reduces overfitting in low-data scenarios.
Self-supervised auxiliary tasks improve generalization.
Abstract
Text classification is a widely studied problem and has broad applications. In many real-world problems, the number of texts for training classification models is limited, which renders these models prone to overfitting. To address this problem, we propose SSL-Reg, a data-dependent regularization approach based on self-supervised learning (SSL). SSL is an unsupervised learning approach which defines auxiliary tasks on input data without using any human-provided labels and learns data representations by solving these auxiliary tasks. In SSL-Reg, a supervised classification task and an unsupervised SSL task are performed simultaneously. The SSL task is unsupervised, which is defined purely on input texts without using any human-provided labels. Training a model using an SSL task can prevent the model from being overfitted to a limited number of class labels in the classification task.…
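The joint training described in the abstract can be sketched as a single objective: a supervised classification loss plus a weighted self-supervised loss computed on the same input text. The sketch below is illustrative only and makes several assumptions not stated in the text above: the SSL task is taken to be masked-token prediction, the two losses are combined as a weighted sum, and the names `ssl_reg_loss` and `ssl_weight` are hypothetical. It uses plain Python so the arithmetic is easy to follow; a real model would compute the logits with a shared text encoder.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def ssl_reg_loss(class_logits, label, token_logits, masked_targets, ssl_weight=0.1):
    """Classification loss plus a weighted self-supervised loss.

    class_logits:   logits over class labels for one text (supervised task).
    label:          index of the human-provided class label.
    token_logits:   one list of vocabulary logits per masked position
                    (assumed SSL task: masked-token prediction).
    masked_targets: original token id at each masked position; these come
                    from the input text itself, not from human labels.
    ssl_weight:     hypothetical trade-off weight for the SSL term.
    """
    supervised = cross_entropy(class_logits, label)
    if masked_targets:
        ssl = sum(
            cross_entropy(logits, target)
            for logits, target in zip(token_logits, masked_targets)
        ) / len(masked_targets)
    else:
        ssl = 0.0  # no masked positions: fall back to the supervised loss alone
    return supervised + ssl_weight * ssl


# Usage: one text, two classes, one masked position over a 4-token vocabulary.
loss = ssl_reg_loss(
    class_logits=[0.0, 0.0],
    label=0,
    token_logits=[[0.0, 0.0, 0.0, 0.0]],
    masked_targets=[2],
    ssl_weight=0.5,
)
```

Because the SSL term depends only on the input text, it adds a training signal that does not come from the limited class labels. This is the sense in which it acts as a data-dependent regularizer. Setting `ssl_weight` to zero recovers ordinary supervised training.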
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Natural Language Processing Techniques
