Diversified Batch Selection for Training Acceleration

Feng Hong; Yueming Lyu; Jiangchao Yao; Ya Zhang; Ivor W. Tsang; Yanfeng Wang

arXiv:2406.04872 · cs.LG · June 10, 2024

TL;DR

This paper introduces Diversified Batch Selection (DivBS), a reference-model-free method that efficiently selects diverse, representative samples during training. It accelerates training without sacrificing performance and outperforms existing reference-model-free methods across the evaluated tasks.

Contribution

The paper proposes a group-wise orthogonalized representativeness criterion for diversified batch selection, which reduces redundancy among the selected samples and improves training efficiency.

Findings

01. DivBS achieves better performance-speedup trade-offs.

02. Extensive experiments validate DivBS's effectiveness across tasks.

03. DivBS outperforms existing reference-model-free methods.

Abstract

The remarkable success of modern machine learning models on large datasets often demands extensive training time and resource consumption. To save cost, a prevalent research line, known as online batch selection, explores selecting informative subsets during the training process. Although recent efforts achieve advancements by measuring the impact of each sample on generalization, their reliance on additional reference models inherently limits their practical application when no such ideal models are available. On the other hand, vanilla reference-model-free methods score and select data independently in a sample-wise manner, which sacrifices diversity and induces redundancy. To tackle this dilemma, we propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples. Specifically, we…
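
The contrast drawn in the abstract, between sample-wise scoring and diversity-aware selection, can be illustrated with a small NumPy sketch. This is not the authors' algorithm (the abstract is truncated before the method details); it is a minimal illustration under stated assumptions. Each sample is assumed to be described by a feature vector (for example an embedding or a last-layer gradient), the sample-wise baseline is assumed to score by feature norm, and the diversity-aware variant is a generic greedy orthogonalization. The function names `topk_by_score` and `greedy_orthogonalized_selection` are hypothetical.

```python
import numpy as np


def topk_by_score(features, k):
    """Sample-wise baseline (assumed): score each sample independently
    by its feature norm and keep the k highest-scoring indices."""
    scores = np.linalg.norm(features, axis=1)
    return [int(i) for i in np.argsort(-scores)[:k]]


def greedy_orthogonalized_selection(features, k, eps=1e-8):
    """Illustrative greedy selection (not the paper's exact procedure).

    At each step, pick the sample whose residual direction captures the
    most of the summed residual of the whole batch, then project that
    direction out of every residual so near-duplicates stop scoring.
    """
    residual = np.asarray(features, dtype=float).copy()
    selected = []
    for _ in range(k):
        target = residual.sum(axis=0)            # what is left to represent
        norms = np.linalg.norm(residual, axis=1)
        valid = norms > eps                      # skip exhausted directions
        if not valid.any():
            break                                # span already covered
        gains = np.full(len(residual), -np.inf)
        gains[valid] = (residual[valid] @ target) ** 2 / norms[valid] ** 2
        i = int(np.argmax(gains))
        selected.append(i)
        q = residual[i] / norms[i]               # unit direction of the pick
        residual -= np.outer(residual @ q, q)    # remove it from all samples
        residual[i] = 0.0                        # never pick the same sample twice
    return selected


# Toy batch: three near-duplicate samples and one orthogonal sample.
features = np.array([
    [3.0, 0.0],
    [2.9, 0.0],
    [2.8, 0.0],
    [0.0, 1.0],
])
baseline = topk_by_score(features, 2)
diverse = greedy_orthogonalized_selection(features, 2)
print("sample-wise top-k:", baseline)
print("greedy orthogonalized:", diverse)
```

On this toy batch the sample-wise baseline keeps two of the near-duplicates, while the greedy variant keeps one of them plus the orthogonal sample, which is the redundancy problem the abstract describes.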

Code & Models

Repositories

Feng-Hong/DivBS (PyTorch, official)

Taxonomy

Topics: Advanced Control Systems Optimization