CAST: Cluster-Aware Self-Training for Tabular Data via Reliable Confidence
Minwook Kim, Juseong Kim, Ki Beom Kim, Giltae Song

TL;DR
CAST introduces a cluster-aware self-training method for tabular data that calibrates pseudo-label confidence based on local density, improving performance and robustness over existing self-training approaches on 21 datasets.
Contribution
The paper proposes a novel cluster-aware confidence calibration for self-training on tabular data, enhancing accuracy while preserving the simplicity and versatility of self-training.
Findings
Outperforms existing self-training methods on 21 datasets.
Improves robustness in various self-training setups.
Calibrates confidence based on local density, reducing errors.
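The idea of density-based confidence calibration can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the multiplicative scaling rule, the `bandwidth` and `threshold` parameters, and all function names below are assumptions chosen for illustration. The sketch scales each class probability by how dense the labeled points of that class are around the sample, so a confident prediction far from any labeled cluster is not accepted as a pseudo-label.

```python
import math


def kernel_density(x, points, bandwidth=1.0):
    """Mean Gaussian kernel value between x and points (unnormalized, in [0, 1])."""
    if not points:
        return 0.0
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / len(points)


def calibrate_confidence(x, probs, labeled_by_class, bandwidth=1.0):
    """Scale each class probability by the local density of that class around x.

    Hypothetical rule: calibrated[c] = probs[c] * density_c(x). The result is
    deliberately not renormalized, so confidence shrinks in low-density regions.
    """
    return {
        c: p * kernel_density(x, labeled_by_class.get(c, []), bandwidth)
        for c, p in probs.items()
    }


def select_pseudo_labels(unlabeled, probs_list, labeled_by_class,
                         threshold=0.9, bandwidth=1.0):
    """Return (index, label, confidence) for samples whose calibrated
    confidence reaches the threshold."""
    selected = []
    for i, (x, probs) in enumerate(zip(unlabeled, probs_list)):
        calibrated = calibrate_confidence(x, probs, labeled_by_class, bandwidth)
        label = max(calibrated, key=calibrated.get)
        if calibrated[label] >= threshold:
            selected.append((i, label, calibrated[label]))
    return selected


# Toy example: two labeled clusters and two unlabeled points that the
# classifier scores identically (0.95 for class 0).
labeled_by_class = {
    0: [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)],
    1: [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1)],
}
unlabeled = [(0.05, 0.05), (2.5, 2.5)]  # inside cluster 0 vs. between clusters
probs_list = [{0: 0.95, 1: 0.05}, {0: 0.95, 1: 0.05}]

pseudo_labels = select_pseudo_labels(unlabeled, probs_list, labeled_by_class)
```

With these toy values, only the point inside the class-0 cluster passes the threshold; the point between the clusters is rejected despite the identical raw classifier probability. Because the calibration only post-processes predicted probabilities, a rule of this kind can sit on top of either a gradient-boosting model or a neural network.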
Abstract
Tabular data is one of the most widely used data modalities, encompassing numerous datasets with substantial amounts of unlabeled data. Despite this prevalence, there is a notable lack of simple and versatile methods for utilizing unlabeled data in the tabular domain, where both gradient-boosting decision trees and neural networks are employed. In this context, self-training has gained traction due to its simplicity and versatility, yet it is vulnerable to noisy pseudo-labels caused by erroneous confidence. Several solutions have been proposed to handle this problem, but they often compromise the inherent advantages of self-training, resulting in limited applicability in the tabular domain. To address this issue, we explore a novel direction of reliable confidence in self-training contexts and conclude that self-training can be improved by ensuring that the confidence, which represents…
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Anomaly Detection Techniques and Applications
Methods: Attentive Walk-Aggregating Graph Neural Network
