Primal Estimated Subgradient Solver for SVM for Imbalanced Classification
John Sun

TL;DR
This paper introduces a cost-sensitive PEGASOS SVM for imbalanced classification, demonstrating competitive performance with less computational cost and extending prior work by incorporating kernels and using Python for implementation.
Contribution
The paper presents a novel cost-sensitive PEGASOS SVM approach that handles imbalanced data efficiently and extends previous methods by adding kernel support and a Python implementation.
Findings
Achieves good performance on imbalanced datasets with majority-to-minority ratios from 8.6:1 to 130:1
Extends prior linear SVM methods with kernel integration
Uses learning and validation curves to analyze overfitting and hyperparameter effects
Abstract
We aim to demonstrate in experiments that our cost-sensitive PEGASOS SVM achieves good performance on imbalanced data sets with a majority-to-minority ratio ranging from 8.6:1 to 130:1, and to ascertain whether including an intercept (bias), regularization, and parameters affects performance on our selection of datasets. Although many resort to SMOTE methods, we aim for a less computationally intensive method. We evaluate the performance by examining the learning curves. These curves diagnose whether we overfit or underfit, or whether the random sample of data chosen during the process was not random enough or diverse enough in the dependent variable class for the algorithm to generalize to unseen examples. We also examine the hyperparameters against the test and train error in validation curves. We benchmark our PEGASOS Cost-Sensitive SVM's results of Ding's LINEAR SVM…
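As a rough illustration of the idea described in the abstract, the sketch below applies the standard PEGASOS stochastic subgradient update to a class-weighted hinge loss. It is not the paper's implementation: the function name `cost_sensitive_pegasos`, the inverse-frequency class costs, the omission of the intercept, and the linear (non-kernel) form are all assumptions made for this example.

```python
import numpy as np

def cost_sensitive_pegasos(X, y, lam=0.01, n_iters=5000, class_cost=None, seed=0):
    """Linear PEGASOS with a class-weighted hinge loss (illustrative sketch).

    X: (n, d) feature matrix; y: labels in {-1, +1}.
    class_cost: hypothetical dict mapping label -> misclassification cost.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    if class_cost is None:
        # Assumption: inverse-frequency costs, so the minority class weighs more.
        n_pos = np.sum(y == 1)
        n_neg = np.sum(y == -1)
        class_cost = {1: n / (2.0 * n_pos), -1: n / (2.0 * n_neg)}
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)            # draw one training example at random
        eta = 1.0 / (lam * t)          # PEGASOS step size 1 / (lambda * t)
        margin = y[i] * (X[i] @ w)     # margin under the current weights
        w *= 1.0 - eta * lam           # shrinkage from the L2 regularizer
        if margin < 1:                 # hinge loss active: take a weighted step
            w += eta * class_cost[int(y[i])] * y[i] * X[i]
    return w

# Usage on a synthetic 20:1 imbalanced, linearly separable toy set.
rng = np.random.default_rng(42)
X_major = rng.normal(loc=-2.0, scale=0.3, size=(500, 2))
X_minor = rng.normal(loc=2.0, scale=0.3, size=(25, 2))
X = np.vstack([X_major, X_minor])
y = np.concatenate([-np.ones(500, dtype=int), np.ones(25, dtype=int)])

w = cost_sensitive_pegasos(X, y)
pred = np.where(X @ w >= 0, 1, -1)
recall = np.mean(pred[y == 1] == 1)   # recall on the minority class
```

Weighting the subgradient step by a per-class cost is one common way to make a hinge-loss learner cost-sensitive without resampling; other weighting schemes would fit the same loop.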
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Text and Document Classification Technologies
Methods: Test · Synthetic Minority Over-sampling Technique · Support Vector Machine
