Offline Clustering Approach to Self-supervised Learning for Class-imbalanced Image Data
Hye-min Chang, Sungkyun Chang

TL;DR
This paper investigates how class imbalance affects self-supervised learning and proposes an offline clustering method for pre-training on imbalanced image datasets, with the aim of improving on SimCLR and SimSiam baselines.
Contribution
It introduces an offline clustering approach to enhance self-supervised pre-training on class-imbalanced data, addressing bias issues in representation learning.
Findings
Class imbalance affects the effectiveness of self-supervised pre-training.
Offline clustering of features can improve model performance on imbalanced data.
Knowledge distillation from expert models enhances overall learning.
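As a rough illustration of the offline clustering step, the sketch below groups precomputed feature vectors with plain k-means and collects one index subset per cluster, which is where each expert model would be trained. The abstract does not name the clustering algorithm, so k-means is an assumption here, and `kmeans`, `features`, and `expert_subsets` are hypothetical names. The toy 2-D points stand in for encoder features from a pre-trained SimCLR or SimSiam model.

```python
import random


def kmeans(features, k, iters=20, seed=0):
    """Plain k-means over a list of feature vectors (assumed algorithm).

    Returns a list with one cluster id per feature vector.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(f) for f in rng.sample(features, k)]
    assign = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, f in enumerate(features):
            assign[i] = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(f, centroids[j])),
            )
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [f for f, a in zip(features, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


# Toy stand-in for encoder features: a large blob and a small blob,
# mimicking a majority and a minority group.
rng = random.Random(1)
features = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(30)]
features += [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(10)]

assign = kmeans(features, k=2)

# One index subset per cluster; each expert would be trained on one subset.
expert_subsets = {j: [i for i, a in enumerate(assign) if a == j] for j in set(assign)}
```

Because the clustering runs once on frozen features rather than inside the training loop, the subsets can be computed ahead of time and reused across expert training runs.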
Abstract
Class-imbalanced datasets are known to bias models toward the majority classes. In this project, we pose two research questions: 1) when is the class-imbalance problem more prevalent in self-supervised pre-training? and 2) can offline clustering of feature representations help pre-training on class-imbalanced data? Our experiments investigate the former question by adjusting the degree of class imbalance when training the baseline models, namely SimCLR and SimSiam, on the CIFAR-10 dataset. To answer the latter question, we train one expert model on each subset of the feature clusters. We then distill the knowledge of the expert models into a single model, so that we can compare the performance of this model to our baselines.
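One way to "adjust the degree of class imbalance" is to subsample a balanced dataset so that class sizes decay exponentially from the largest class to the smallest. The sketch below does this on a toy label list shaped like CIFAR-10 (10 classes). The exponential profile, the `imbalance_ratio` parameter, and the function name `imbalanced_indices` are assumptions for illustration. The abstract does not specify the protocol the authors used.

```python
import random
from collections import Counter


def imbalanced_indices(labels, num_classes, imbalance_ratio, seed=0):
    """Return sorted indices of a subset with exponentially decaying class sizes.

    imbalance_ratio is (smallest class size) / (largest class size), in (0, 1].
    The exponential profile is an assumed convention, not the paper's protocol.
    """
    rng = random.Random(seed)
    by_class = {c: [] for c in range(num_classes)}
    for i, y in enumerate(labels):
        by_class[y].append(i)
    # Use the smallest available class as the size of the largest kept class.
    n_max = min(len(v) for v in by_class.values())
    keep = []
    for c in range(num_classes):
        # Class 0 keeps n_max samples; the last class keeps n_max * imbalance_ratio.
        frac = imbalance_ratio ** (c / max(num_classes - 1, 1))
        n_c = max(1, round(n_max * frac))
        keep.extend(rng.sample(by_class[c], n_c))
    return sorted(keep)


# Toy stand-in for CIFAR-10 labels: 10 classes, 100 samples each.
labels = [c for c in range(10) for _ in range(100)]
subset = imbalanced_indices(labels, num_classes=10, imbalance_ratio=0.1)
counts = Counter(labels[i] for i in subset)
```

Sweeping `imbalance_ratio` from 1.0 (balanced) toward small values gives a series of training sets with increasing imbalance, on which the baselines could be pre-trained and compared.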
Taxonomy
Topics: Imbalanced Data Classification Techniques · Digital Imaging for Blood Diseases · Retinal Imaging and Analysis
Methods: Average Pooling · Batch Normalization · Residual Block · Global Average Pooling · 1x1 Convolution · Kaiming Initialization · Convolution · Dense Connections · Residual Connection
