Semi-Supervised Learning with Balanced Deep Representation Distributions
Changchun Li, Ximing Li, Bingjie Zhang, Wenting Wang, Jihong Ouyang

TL;DR
This paper introduces S2TC-BDD, a semi-supervised text classification method that balances deep representation distributions using angular margin loss and Gaussian transformations, improving accuracy especially with limited labeled data.
Contribution
The paper proposes a novel semi-supervised text classification approach that balances label representation distributions, enhancing pseudo-label accuracy and overall performance.
Findings
S2TC-BDD outperforms state-of-the-art SSTC methods.
Balanced label angle variances improve pseudo-label accuracy.
The method is especially effective when labeled data are scarce.
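The notion of label angle variance in the findings above can be illustrated with a small sketch. It assumes that a text's label angle is the angle between its deep representation and a per-label prototype vector; the summary does not give the formula, so this definition, the function names `label_angles` and `label_angle_variances`, and the synthetic 2-D data are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def label_angles(reps, labels, prototypes):
    """Angle in radians between each representation and its own label's prototype.

    Assumption: the 'label angle' is measured against a per-label prototype vector.
    """
    reps_n = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    protos_n = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cos = np.sum(reps_n * protos_n[labels], axis=1)
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cos, -1.0, 1.0))

def label_angle_variances(reps, labels, prototypes):
    """Variance of label angles, computed separately for each label."""
    angles = label_angles(reps, labels, prototypes)
    num_labels = prototypes.shape[0]
    return np.array([angles[labels == k].var() for k in range(num_labels)])

# Synthetic example: label 0 is tightly clustered, label 1 is widely spread.
rng = np.random.default_rng(0)
prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
tight = prototypes[0] + 0.05 * rng.normal(size=(200, 2))
loose = prototypes[1] + 0.50 * rng.normal(size=(200, 2))
reps = np.vstack([tight, loose])
labels = np.array([0] * 200 + [1] * 200)

variances = label_angle_variances(reps, labels, prototypes)
print(variances)
```

In this toy setting the two labels have very different angle variances, which is the kind of imbalance the method aims to reduce.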
Abstract
Semi-Supervised Text Classification (SSTC) methods mainly work in the spirit of self-training. They initialize a deep classifier by training on labeled texts, then alternate between predicting pseudo-labels for unlabeled texts and training the classifier on the mixture of labeled and pseudo-labeled texts. Naturally, their performance depends heavily on the accuracy of the pseudo-labels. Unfortunately, that accuracy is often low because of the margin bias problem, which is caused by large differences between the representation distributions of labels in SSTC. To alleviate this problem, we apply the angular margin loss and perform several Gaussian linear transformations to achieve balanced label angle variances, i.e., the variance of label angles of texts within the same label. More accurate pseudo-labels can be achieved by constraining all label…
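The self-training loop described in the abstract can be sketched as follows. This is a minimal illustration only: a cosine-similarity nearest-centroid classifier stands in for the deep classifier, and the names `fit_centroids`, `predict`, and `self_train`, along with the synthetic data, are hypothetical. It does not include the angular margin loss or the Gaussian linear transformations.

```python
import numpy as np

def fit_centroids(x, y, num_labels):
    """'Train' the stand-in classifier: one mean vector per label."""
    return np.stack([x[y == k].mean(axis=0) for k in range(num_labels)])

def predict(x, centroids):
    """Assign each row of x to the label whose centroid has the highest cosine similarity."""
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    cn = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return np.argmax(xn @ cn.T, axis=1)

def self_train(x_lab, y_lab, x_unlab, num_labels, rounds=5):
    # Step 1: initialize the classifier on labeled data only.
    centroids = fit_centroids(x_lab, y_lab, num_labels)
    for _ in range(rounds):
        # Step 2: predict pseudo-labels for the unlabeled data.
        pseudo = predict(x_unlab, centroids)
        # Step 3: retrain on the mixture of labeled and pseudo-labeled data.
        x_mix = np.vstack([x_lab, x_unlab])
        y_mix = np.concatenate([y_lab, pseudo])
        centroids = fit_centroids(x_mix, y_mix, num_labels)
    return centroids, predict(x_unlab, centroids)

# Synthetic data: two labels, 3 labeled and 100 unlabeled points per label.
rng = np.random.default_rng(0)
centers = np.array([[1.0, 0.0], [0.0, 1.0]])
x_lab = np.vstack([c + 0.2 * rng.normal(size=(3, 2)) for c in centers])
y_lab = np.array([0, 0, 0, 1, 1, 1])
x_unlab = np.vstack([c + 0.2 * rng.normal(size=(100, 2)) for c in centers])
y_true = np.array([0] * 100 + [1] * 100)  # held out, used only for evaluation

centroids, pseudo_labels = self_train(x_lab, y_lab, x_unlab, num_labels=2)
accuracy = float(np.mean(pseudo_labels == y_true))
print(f"pseudo-label accuracy: {accuracy:.3f}")
```

Because every round retrains on its own predictions, any systematic error in the pseudo-labels feeds back into the classifier, which is why the abstract stresses pseudo-label accuracy.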
Taxonomy
Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
