Merged-GHCIDR: Geometrical Approach to Reduce Image Data
Devvrat Joshi, Janvi Thakkar, Siddharth Soni, Shril Mody, Rohan Patil, Nipun Batra

TL;DR
This paper introduces Merged-GHCIDR, a geometrical clustering method to reduce dataset size for training deep neural networks, improving accuracy and efficiency across multiple datasets.
Contribution
It proposes a novel geometrical clustering approach, Merged-GHCIDR, that enhances data reduction techniques for deep learning datasets, leading to better accuracy and training time.
Findings
Merged-GHCIDR improves accuracy by up to 8.9% on Fashion-MNIST.
The method reduces dataset size while maintaining or increasing model accuracy.
Experiments on four datasets demonstrate the effectiveness of the proposed approach.
Abstract
The computational resources required to train a model have been increasing since the inception of deep networks. Training neural networks on massive datasets has become a challenging and time-consuming task, so there is a need to reduce dataset size without compromising accuracy. In this paper, we present novel variations of an earlier approach, reduction through homogeneous clustering, for reducing dataset size. The proposed methods are based on the idea of partitioning the dataset into homogeneous clusters and selecting images that contribute significantly to the accuracy. We propose two variations, Geometrical Homogeneous Clustering for Image Data Reduction (GHCIDR) and Merged-GHCIDR, which build on the baseline algorithm, Reduction through Homogeneous Clustering (RHC), to achieve better accuracy and training time. The intuition behind GHCIDR involves selecting data points by…
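The abstract's core idea, partitioning a labeled dataset into homogeneous clusters and keeping a small representative set, can be illustrated with a minimal sketch of the RHC baseline as it is commonly described: a cluster containing more than one class is re-split with k-means seeded at its per-class means, and a single-class cluster is replaced by its centroid. This is not the authors' GHCIDR or Merged-GHCIDR selection rule, which the truncated abstract does not specify. The function names (`rhc_reduce`, `kmeans`, `mean`, `sq_dist`), the iteration cap, and the fallback for degenerate splits are assumptions made for illustration.

```python
from collections import deque


def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, seeds, max_iter=50):
    """Plain Lloyd's k-means from given seeds; returns a cluster index per point."""
    centers = [list(s) for s in seeds]
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # leave a center untouched if it lost all its points
                centers[k] = mean(members)
    return assign


def rhc_reduce(points, labels):
    """Return (prototype, label) pairs, one per homogeneous cluster."""
    reduced = []
    queue = deque([list(range(len(points)))])  # start with one cluster: everything
    while queue:
        idx = queue.popleft()
        classes = sorted({labels[i] for i in idx})
        pts = [points[i] for i in idx]
        if len(classes) == 1:
            # Homogeneous cluster: keep only its centroid.
            reduced.append((mean(pts), classes[0]))
            continue
        # Mixed cluster: split with k-means seeded at the per-class means.
        seeds = [mean([points[i] for i in idx if labels[i] == c]) for c in classes]
        assign = kmeans(pts, seeds)
        groups = {}
        for i, a in zip(idx, assign):
            groups.setdefault(a, []).append(i)
        if len(groups) == 1:
            # Assumed fallback: k-means could not separate the points
            # (e.g. identical inputs with different labels), so split by label.
            for c in classes:
                queue.append([i for i in idx if labels[i] == c])
        else:
            queue.extend(groups.values())
    return reduced


# Toy data: two well-separated classes of three 2-D points each.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels = [0, 0, 0, 1, 1, 1]
reduced = rhc_reduce(points, labels)
```

On this toy input the six points collapse to two prototypes, one per class. For image data, each image would first be flattened into a feature vector, and a variant that keeps actual images rather than centroids would replace the `mean(pts)` step with its own selection rule.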
Taxonomy
Topics: AI in cancer detection · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
