Optimizing Data Curation through Spectral Analysis and Joint Batch Selection (SALN)
Mohammadreza Sharifi

TL;DR
SALN is a novel batch selection method using spectral analysis that significantly accelerates training and improves accuracy in deep learning models by prioritizing informative samples within each batch.
Contribution
This paper introduces SALN, a spectral analysis-based batch selection method that enhances training efficiency and accuracy by jointly selecting informative samples within each batch.
Findings
Up to 8x reduction in training time.
Up to 5% increase in accuracy.
Outperforms Google's JEST method.
Abstract
In modern deep learning models, long training times and large datasets present significant challenges to both efficiency and scalability. Effective data curation and sample selection are crucial for optimizing the training process of deep neural networks. This paper introduces SALN, a method designed to prioritize and select samples within each batch rather than from the entire dataset. By utilizing jointly selected batches, SALN enhances training efficiency compared to independent batch selection. The proposed method applies a spectral analysis-based heuristic to identify the most informative data points within each batch, improving both training speed and accuracy. The SALN algorithm significantly reduces training time and enhances accuracy when compared to traditional batch prioritization or standard training procedures. It demonstrates up to an 8x reduction in training time and up…
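The abstract does not spell out the spectral heuristic, so the following is only a minimal sketch of what within-batch spectral selection could look like, not the author's SALN algorithm. Everything in it is an assumption made for illustration: the function name `select_sub_batch`, the cosine-similarity graph, the unnormalized Laplacian, and the scoring rule (magnitude of each sample in the leading non-trivial eigenvectors).

```python
import numpy as np


def select_sub_batch(features, k, n_components=2):
    """Return indices of k samples from one batch, ranked by a spectral score.

    Hypothetical heuristic for illustration only; the paper's actual
    scoring rule is not given in the abstract.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the batch size")

    # Affinity graph over the batch: non-negative cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    affinity = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(affinity, 0.0)

    # Unnormalized graph Laplacian L = D - W (symmetric, so eigh applies).
    laplacian = np.diag(affinity.sum(axis=1)) - affinity

    # eigh returns eigenvalues in ascending order; skip the first
    # (near-constant) eigenvector and keep the next n_components.
    _, eigvecs = np.linalg.eigh(laplacian)
    embedding = eigvecs[:, 1:1 + n_components]

    # Assumed score: samples with large spectral coordinates are kept.
    scores = np.linalg.norm(embedding, axis=1)
    return np.argsort(-scores)[:k]
```

In a training loop, one would draw a larger candidate batch, compute per-sample features (for example, penultimate-layer activations), call `select_sub_batch(features, k)`, and run the gradient step only on the returned indices. Because all candidates are scored together through one graph, the selection is joint rather than independent per sample.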
Taxonomy
Topics: Neural Networks and Applications · Artificial Intelligence in Healthcare · Data Quality and Management
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
