Leveraging Perceptual Scores for Dataset Pruning in Computer Vision Tasks
Raghavendra Singh

TL;DR
This paper introduces an entropy-based image score, derived from the bits-per-pixel of the compressed image, for dataset pruning. The score aims to select diverse, perceptually complex images efficiently and without supervision, improving semantic segmentation performance.
Contribution
The paper proposes a simple, intrinsic image score for dataset pruning based on the entropy of the compressed image, enhancing diversity and performance in vision tasks without additional computational cost.
Findings
Entropy-based scores effectively select diverse images
Graph-based method improves spatial diversity of samples
Method yields strong results in semantic segmentation
Abstract
In this paper we propose an image score for coreset selection in image classification and semantic segmentation tasks. The score is the entropy of an image, approximated by the bits-per-pixel of its compressed version. The score is therefore intrinsic to the image and requires no supervision or training. It is simple to compute and readily available, since all images are stored in a compressed format. The motivation behind our choice of score is that most other scores proposed in the literature are expensive to compute. More importantly, we want a score that captures the perceptual complexity of an image. Entropy is one such measure: images with clutter tend to have higher entropy. However, sampling only low-entropy iconic images, for example, leads to biased learning and an overall decrease in test performance with current deep learning models. To mitigate the bias we use a…
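The bits-per-pixel score described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names `bits_per_pixel` and `bpp_score_zlib` are hypothetical, and `zlib` is used only as a standard-library stand-in for whatever image codec (for example JPEG) the stored images actually use. For a real dataset, the compressed size would simply be the file size on disk.

```python
import random
import zlib


def bits_per_pixel(num_compressed_bytes: int, width: int, height: int) -> float:
    """Compressed size in bits divided by the number of pixels."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return 8.0 * num_compressed_bytes / (width * height)


def bpp_score_zlib(pixels: bytes, width: int, height: int) -> float:
    """Score raw 8-bit grayscale pixels by compressing them with zlib.

    Assumption: zlib stands in for the image codec; the abstract only says
    the score is the bits-per-pixel of the compressed image.
    """
    compressed = zlib.compress(pixels, 9)
    return bits_per_pixel(len(compressed), width, height)


width, height = 64, 64

# A uniform grey image: almost no information, so it compresses very well.
flat = bytes([128]) * (width * height)

# A noise image: close to incompressible, so its score is high.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(width * height))

flat_score = bpp_score_zlib(flat, width, height)
noisy_score = bpp_score_zlib(noisy, width, height)
print(f"flat: {flat_score:.3f} bpp, noisy: {noisy_score:.3f} bpp")
```

The flat image scores far lower than the noise image, which matches the abstract's observation that cluttered images tend to have higher entropy.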
Taxonomy
Topics: Neural Networks and Applications · Industrial Vision Systems and Defect Detection
