Explorations in Self-Supervised Learning: Dataset Composition Testing for Object Classification
Raynor Kirkson E. Chavez, Kyle Gabriel M. Reynoso

TL;DR
This study examines how different dataset characteristics like modality, luminosity, and image size influence self-supervised learning performance in object classification, revealing specific conditions where certain pretraining strategies excel.
Contribution
It introduces a systematic analysis of dataset composition effects on SSL models, highlighting optimal pretraining conditions based on image properties.
Findings
Depth pretraining benefits low-resolution images.
RGB pretraining outperforms on high-resolution images.
Increasing luminosity improves low-resolution model performance.
Abstract
This paper investigates the impact of sampling and pretraining using datasets with different image characteristics on the performance of self-supervised learning (SSL) models for object classification. To do this, we sample two apartment datasets from the Omnidata platform based on modality, luminosity, image size, and camera field of view and use them to pretrain a SimCLR model. The encodings generated from the pretrained model are then transferred to a supervised ResNet-50 model for object classification. Through A/B testing, we find that depth-pretrained models are more effective on low-resolution images, while RGB-pretrained models perform better on higher-resolution images. We also discover that increasing the luminosity of training images can improve the performance of models on low-resolution images without negatively affecting their performance on higher-resolution images.
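As a rough illustration of the pretraining stage mentioned in the abstract, the sketch below computes the normalized temperature-scaled cross-entropy (NT-Xent) loss that SimCLR is generally trained with. It is not taken from the paper: the function name `nt_xent_loss`, the NumPy implementation, and the default temperature of 0.5 are all assumptions made for illustration, and the authors' actual settings are not stated in this summary.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for N image pairs (hypothetical helper, not the paper's code).

    z_a, z_b: arrays of shape (N, D) holding embeddings of two augmented
    views of the same N images. The temperature value is an assumed default.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # unit norm, so dot product = cosine
    sim = (z @ z.T) / temperature                       # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                      # a view is never its own candidate

    # The positive for row i is the other view of the same image.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views_a = rng.normal(size=(8, 16))
    views_b = views_a + 0.01 * rng.normal(size=(8, 16))  # nearly identical second view
    print(nt_xent_loss(views_a, views_b))
```

In this formulation the loss falls as the two views of each image move closer together relative to every other embedding in the batch, which is what lets the encoder learn features without labels before they are handed to a supervised classifier.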
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications
Methods: Convolution · Max Pooling · Average Pooling · Global Average Pooling · Kaiming Initialization · Dense Connections · Feedforward Network · Random Resized Crop · Color Jitter
