Sparse classification with positive-confidence data in high dimensions
The Tien Mai, Mai Anh Nguyen, Trung Nghia Nguyen

TL;DR
This paper develops a sparse classification method for high-dimensional positive-confidence data, introducing new estimators with theoretical guarantees and an efficient algorithm, bridging weak supervision and high-dimensional statistics.
Contribution
It proposes a novel sparse-penalization framework for high-dimensional Pconf classification with theoretical error bounds and an efficient optimization algorithm.
Findings
Achieves near minimax-optimal sparse recovery rates.
Demonstrates predictive performance competitive with fully supervised methods.
Effectively recovers relevant features in high-dimensional weakly supervised data.
Abstract
High-dimensional learning problems, where the number of features exceeds the sample size, often require sparse regularization for effective prediction and variable selection. While established for fully supervised data, these techniques remain underexplored in weak-supervision settings such as Positive-Confidence (Pconf) classification. Pconf learning utilizes only positive samples equipped with confidence scores, thereby avoiding the need for negative data. However, existing Pconf methods are ill-suited for high-dimensional regimes. This paper proposes a novel sparse-penalization framework for high-dimensional Pconf classification. We introduce estimators using convex (Lasso) and non-convex (SCAD, MCP) penalties to address shrinkage bias and improve feature recovery. Theoretically, we establish estimation and prediction error bounds for the L1-regularized Pconf estimator, proving it…
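The abstract does not write out the objective, so the following is only a minimal sketch of what an L1-regularized Pconf estimator could look like. It assumes the standard Pconf empirical risk (logistic loss on each positive sample plus a (1 - r)/r weighted loss on its sign-flipped score, where r is the confidence), a linear score, and plain proximal gradient descent (ISTA). The function names, the step-size rule, and the choice of ISTA are illustrative assumptions and may differ from the authors' estimator and algorithm; the non-convex SCAD and MCP penalties are not covered here.

```python
import numpy as np


def pconf_logistic_loss_grad(w, X, r):
    """Pconf empirical risk (logistic loss, linear score x @ w) and its gradient.

    X : (n, d) positive samples only.
    r : (n,) confidence scores in (0, 1], assumed to approximate p(y=+1 | x).
    """
    z = X @ w
    c = (1.0 - r) / r  # weight on the sign-flipped ("pseudo-negative") term
    # log(1 + exp(-z)) + c * log(1 + exp(z)), computed stably
    loss = np.mean(np.logaddexp(0.0, -z) + c * np.logaddexp(0.0, z))
    s = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid(z)
    grad_z = (s - 1.0) + c * s          # derivative of the per-sample loss in z
    grad = X.T @ grad_z / X.shape[0]
    return loss, grad


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_l1_pconf(X, r, lam, n_iter=500):
    """ISTA for: Pconf logistic risk + lam * ||w||_1 (hypothetical helper)."""
    n, d = X.shape
    c = (1.0 - r) / r
    # Upper bound on the gradient's Lipschitz constant:
    # the per-sample curvature in z is at most (1 + c_i) / 4.
    lipschitz = np.linalg.norm(X, 2) ** 2 * (1.0 + c.max()) / (4.0 * n)
    step = 1.0 / lipschitz
    w = np.zeros(d)
    for _ in range(n_iter):
        _, grad = pconf_logistic_loss_grad(w, X, r)
        w = soft_threshold(w - step * grad, step * lam)
    return w
```

Calling fit_l1_pconf(X, r, lam) with X holding only positive samples and r their confidences returns a coefficient vector whose nonzero entries are the selected features. Confidences very close to zero inflate the weights (1 - r)/r, so clipping r away from zero before fitting is a reasonable precaution in this sketch.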
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
