AEMLO: AutoEncoder-Guided Multi-Label Oversampling
Ao Zhou, Bin Liu, Jin Wang, Kaiwei Sun, Kelin Liu

TL;DR
AEMLO introduces an AutoEncoder-guided oversampling method that generates diverse synthetic samples for imbalanced multi-label datasets, improving classifier performance over existing techniques.
Contribution
The paper proposes a novel AutoEncoder-based oversampling approach specifically designed for multi-label imbalance, with a tailored objective function for better synthetic sample generation.
Findings
AEMLO outperforms state-of-the-art oversampling methods in empirical tests.
The method effectively generates diverse synthetic samples for imbalanced multi-label data.
AEMLO improves multi-label classifier performance on benchmark datasets.
Abstract
Class imbalance significantly impacts the performance of multi-label classifiers. Oversampling is one of the most popular approaches, as it augments instances associated with less frequent labels to balance the class distribution. Existing oversampling methods generate feature vectors of synthetic samples through replication or linear interpolation and assign labels through neighborhood information. Linear interpolation typically generates new samples between existing data points, which may limit the diversity of the synthesized samples and, in turn, lead to overfitting. Deep learning-based methods, such as AutoEncoders, have been proposed to generate more diverse and complex synthetic samples, achieving excellent performance on imbalanced binary or multi-class datasets. In this study, we introduce AEMLO, an AutoEncoder-guided Oversampling technique specifically…
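To make the baseline described in the abstract concrete, the following is a minimal sketch of interpolation-based multi-label oversampling: features come from linear interpolation between a minority sample and one of its neighbours, and labels come from a neighbourhood vote. It is not AEMLO and not any specific published algorithm; the function name `interpolate_oversample`, the parameters `n_new` and `k`, the choice of the single rarest label as the minority, and the 0.5 voting threshold are all assumptions made for illustration.

```python
import numpy as np

def interpolate_oversample(X, Y, n_new, k=3, seed=0):
    """Illustrative SMOTE-style sketch (hypothetical helper, not AEMLO)."""
    rng = np.random.default_rng(seed)
    # Assumption: treat the label with the fewest (but nonzero) positives as the minority.
    counts = Y.sum(axis=0)
    minority = int(np.argmin(np.where(counts > 0, counts, np.inf)))
    seeds = np.flatnonzero(Y[:, minority] == 1)

    X_new, Y_new = [], []
    for _ in range(n_new):
        i = rng.choice(seeds)
        # k nearest neighbours of the seed by Euclidean distance, excluding the seed itself.
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf
        nbrs = np.argsort(dist)[:k]
        j = rng.choice(nbrs)
        # Feature vector: a random point on the segment between seed and neighbour.
        lam = rng.random()
        X_new.append(X[i] + lam * (X[j] - X[i]))
        # Label vector: majority vote over the seed and its k neighbours (assumed threshold 0.5).
        votes = Y[np.append(nbrs, i)].mean(axis=0)
        Y_new.append((votes >= 0.5).astype(int))
    return np.array(X_new), np.array(Y_new)
```

Because every synthetic feature vector is a convex combination of two existing samples, it can never leave the region already spanned by the data, which is one way to see the limited diversity the abstract refers to.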
Taxonomy
Topics: Text and Document Classification Technologies
