Online AUC Optimization for Sparse High-Dimensional Datasets
Baojian Zhou, Yiming Ying, Steven Skiena

TL;DR
This paper introduces \textsc{FTRL-AUC}, an online algorithm for high-dimensional sparse data that reduces the per-iteration cost from $\mathcal{O}(d)$ to $\mathcal{O}(k)$, improving efficiency and sparsity in AUC maximization.
Contribution
The paper presents a novel online AUC optimization algorithm with a lower per-iteration complexity of $\mathcal{O}(k)$ and better sparsity handling, suitable for streaming high-dimensional sparse datasets.
Findings
$\mathcal{O}(k)$ per-iteration processing, faster than existing methods
Significantly improved model sparsity in experiments
Achieved competitive AUC scores, especially on imbalanced datasets
Abstract
The Area Under the ROC Curve (AUC) is a widely used performance measure for imbalanced classification arising from many application domains where high-dimensional sparse data is abundant. In such cases, each $d$-dimensional sample has only $k$ non-zero features with $k \ll d$, and data arrives sequentially in a streaming form. Current online AUC optimization algorithms have high per-iteration cost and usually produce non-sparse solutions, and hence are not suitable for handling the data challenge mentioned above. In this paper, we aim to directly optimize the AUC score for high-dimensional sparse datasets under the online learning setting and propose a new algorithm, \textsc{FTRL-AUC}. Our proposed algorithm can process data in an online fashion with a much cheaper per-iteration cost $\mathcal{O}(k)$, making it amenable for high-dimensional sparse streaming…
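To make the per-iteration cost claim concrete, here is a minimal sketch of how a follow-the-regularized-leader update can touch only the non-zero features of each sample (write k for their count) rather than all d coordinates. This is not the paper's \textsc{FTRL-AUC}: it is the generic per-coordinate FTRL-Proximal update with L1/L2 regularization, and it swaps in a logistic loss where the paper uses an AUC objective. The class name, hyperparameter names, and default values are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class SparseFTRLProximal:
    """Generic per-coordinate FTRL-Proximal with L1/L2 regularization.

    Illustrative only: uses logistic loss, NOT the paper's AUC objective.
    State is stored in dicts, so memory and time per update scale with the
    number of non-zero features seen, never with the ambient dimension d.
    """

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # accumulated (adjusted) gradients
        self.n = defaultdict(float)  # accumulated squared gradients

    def weight(self, i):
        """Closed-form weight for coordinate i; exactly 0 when |z_i| <= l1."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        n = self.n.get(i, 0.0)
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(n)) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    @staticmethod
    def _sigmoid(score):
        score = max(-35.0, min(35.0, score))  # clamp to avoid overflow
        return 1.0 / (1.0 + math.exp(-score))

    def predict(self, x):
        """x is a sparse sample: {feature_index: value}."""
        return self._sigmoid(sum(self.weight(i) * v for i, v in x.items()))

    def update(self, x, y):
        """One online step on sample x with label y in {0, 1}.

        Only the coordinates present in x are read or written.
        """
        w = {i: self.weight(i) for i in x}
        p = self._sigmoid(sum(w[i] * v for i, v in x.items()))
        for i, v in x.items():
            g = (p - y) * v  # logistic-loss gradient for coordinate i
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w[i]
            self.n[i] += g * g
        return p


# Hypothetical toy stream: feature indices live in a huge space,
# but each sample has only two non-zero entries.
model = SparseFTRLProximal(alpha=0.5, l1=0.01)
stream = [({0: 1.0, 5: 1.0}, 1), ({1: 1.0, 5: 1.0}, 0)] * 50
for x, y in stream:
    model.update(x, y)
```

The L1 threshold in `weight` is what yields sparse models in this generic scheme: any coordinate whose accumulated gradient stays within `l1` keeps a weight of exactly zero. How the paper adapts this idea to an AUC objective is not described in the text above.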
Taxonomy
Topics: Advanced Bandit Algorithms Research · Imbalanced Data Classification Techniques · Sparse and Compressive Sensing Techniques
