CLIP: Train Faster with Less Data
Muhammad Asif Khan, Ridha Hamila, and Hamid Menouar

TL;DR
This paper introduces CLIP (Curriculum Learning with Iterative data Pruning), a data-centric training method that combines curriculum learning and dataset pruning to improve model accuracy and speed up convergence with less data.
Contribution
The paper integrates loss-aware dataset pruning into curriculum learning, a combination it presents as novel in data-centric deep learning.
Findings
Reduces convergence time significantly
Improves generalization performance
Validates effectiveness on crowd density estimation models
Abstract
Deep learning models require an enormous amount of data for training. Recently, however, machine learning has been shifting from model-centric to data-centric approaches. In data-centric approaches, the focus is on refining and improving the quality of the data, rather than redesigning model architectures, to improve the learning performance of the models. In this paper, we propose CLIP, i.e., Curriculum Learning with Iterative data Pruning. CLIP combines two data-centric approaches, curriculum learning and dataset pruning, to improve model accuracy and convergence speed. The proposed scheme applies loss-aware dataset pruning to iteratively remove the least significant samples, progressively reducing the size of the effective dataset during curriculum learning training. Extensive experiments performed on crowd density estimation models validate the notion behind combining…
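The iterative pruning loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy 1-D linear model, the function names (make_toy_data, train_with_iterative_pruning), the stage count, and the prune fraction are all hypothetical. The abstract also does not say which samples count as "least significant", so the sketch assumes they are the ones with the lowest current loss. The curriculum is represented only as training in stages on a shrinking dataset.

```python
import random


def make_toy_data(n=100, seed=0):
    """Hypothetical toy dataset: y = 2x + 1 plus small Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)))
    return data


def per_sample_loss(w, b, x, y):
    """Squared error of the linear model w*x + b on one sample."""
    return (w * x + b - y) ** 2


def train_with_iterative_pruning(data, stages=4, epochs_per_stage=20,
                                 prune_fraction=0.2, lr=0.01, seed=0):
    """Train in stages; after each stage except the last, prune the dataset.

    Returns the fitted (w, b) and the dataset size used in each stage.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    active = list(data)          # the "effective" dataset for this stage
    sizes = [len(active)]

    for stage in range(stages):
        # Plain SGD over the current effective dataset.
        for _ in range(epochs_per_stage):
            rng.shuffle(active)
            for x, y in active:
                err = w * x + b - y
                w -= lr * 2.0 * err * x
                b -= lr * 2.0 * err

        if stage < stages - 1:
            # Loss-aware pruning. ASSUMPTION: the lowest-loss samples are
            # treated as the least significant and are removed.
            active.sort(key=lambda s: per_sample_loss(w, b, s[0], s[1]),
                        reverse=True)
            n_prune = int(len(active) * prune_fraction)
            active = active[:max(1, len(active) - n_prune)]
            sizes.append(len(active))

    return w, b, sizes


if __name__ == "__main__":
    w, b, sizes = train_with_iterative_pruning(make_toy_data())
    print("fitted w, b:", w, b)
    print("dataset size per stage:", sizes)
```

In a real setting the per-sample loss would come from the network being trained (for example, a crowd density estimation model), and the pruning schedule would be a tuning choice rather than the fixed fraction used here.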
Taxonomy
Topics: Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI · Data Stream Mining Techniques
Methods: Dataset Pruning · Pruning
