Graph Construction for Learning with Unbalanced Data
Jing Qian, Venkatesh Saligrama, Manqi Zhao

TL;DR
This paper introduces a novel graph construction method that uses global statistical information to improve learning with unbalanced data, addressing failures of traditional graphs like k-NN.
Contribution
The paper proposes a rank-modulated degree scheme that encodes density information into graph construction, enhancing performance on unbalanced datasets.
Findings
Significantly improves graph-based learning on unbalanced data
Theoretical justification via limit cut analysis
Demonstrated effectiveness on synthetic and real datasets
Abstract
Unbalanced data arises in many learning tasks such as clustering of multi-class data, hierarchical divisive clustering and semi-supervised learning. Graph-based approaches are popular tools for these problems. Graph construction is an important aspect of graph-based learning. We show that graph-based algorithms can fail for unbalanced data for many popular graphs such as k-NN, ε-neighborhood and full-RBF graphs. We propose a novel graph construction technique that encodes global statistical information into node degrees through a ranking scheme. The rank of a data sample is an estimate of its p-value and is proportional to the total number of data samples with smaller density. This ranking scheme serves as a surrogate for density, can be reliably estimated, and indicates whether a data sample is close to valleys/modes. This rank-modulated degree (RMD) scheme is able to…
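The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the density surrogate (distance to the k-th nearest neighbor), the modulation rule `k * (lam + 2 * (1 - lam) * rank)`, and the names `rank_modulated_degrees`, `build_rmd_graph`, `k`, and `lam` are all assumptions chosen for illustration. The only part taken from the abstract is that a sample's rank is proportional to the number of samples with smaller density, and that this rank modulates the node degree.

```python
import numpy as np


def rank_modulated_degrees(X, k=5, lam=0.5):
    """Return (ranks, degrees, D) for the rows of X.

    Assumptions (not from the paper summary): density is approximated by
    the k-th nearest-neighbor distance, and the degree rule below is a
    hypothetical way to let the rank modulate a base degree k.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Column 0 of each sorted row is the self-distance (0), so column k
    # holds the distance to the k-th nearest neighbor.
    knn_dist = np.sort(D, axis=1)[:, k]
    # A larger k-NN distance means lower density. ranks[i] is the fraction
    # of samples whose estimated density is smaller than that of sample i.
    ranks = (knn_dist[None, :] > knn_dist[:, None]).sum(axis=1) / n
    # Hypothetical modulation: low-rank (valley) points get fewer edges,
    # high-rank (mode) points get more.
    degrees = np.round(k * (lam + 2 * (1 - lam) * ranks)).astype(int)
    degrees = np.maximum(degrees, 1)
    return ranks, degrees, D


def build_rmd_graph(X, k=5, lam=0.5):
    """Connect each point to its degrees[i] nearest neighbors, then symmetrize."""
    ranks, degrees, D = rank_modulated_degrees(X, k=k, lam=lam)
    n = X.shape[0]
    order = np.argsort(D, axis=1)  # order[i, 0] is i itself if points are distinct
    A = np.zeros((n, n), dtype=bool)
    for i in range(n):
        A[i, order[i, 1:degrees[i] + 1]] = True
    return A | A.T
```

Under these assumptions, calling `build_rmd_graph(X)` on data with one dense and one sparse cluster gives points in low-density regions fewer outgoing edges than a plain k-NN graph would, which is the qualitative effect the abstract describes. The sketch needs more than `2 * k` samples so that every requested neighbor exists.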
Taxonomy
Topics: Imbalanced Data Classification Techniques · Text and Document Classification Technologies · Face and Expression Recognition
