Bandit Guided Submodular Curriculum for Adaptive Subset Selection

Prateek Chanda; Prayas Agrawal; Saral Sureka; Lokesh Reddy Polu; Atharv Kshirsagar; Ganesh Ramakrishnan

arXiv:2511.22944 · cs.LG · December 1, 2025

TL;DR

This paper introduces ONLINESUBMOD, a bandit-based approach for adaptive subset selection in curriculum learning, leveraging submodular functions to improve sample selection efficiency and accuracy in vision and language tasks.

Contribution

The paper formulates adaptive subset selection as a multi-armed bandit problem in which each arm is a submodular function guiding sample selection, and proposes ONLINESUBMOD, an online greedy policy with provable no-regret guarantees.

Findings

01 Outperforms traditional curriculum learning and bi-level optimization approaches

02 Achieves better accuracy-efficiency tradeoffs than those baselines

03 Effective across vision and language datasets

Abstract

Traditional curriculum learning proceeds from easy to hard samples, yet defining a reliable notion of difficulty remains elusive. Prior work has used submodular functions to induce difficulty scores in curriculum learning. We reinterpret adaptive subset selection and formulate it as a multi-armed bandit problem, where each arm corresponds to a submodular function guiding sample selection. We introduce ONLINESUBMOD, a novel online greedy policy that optimizes a utility-driven reward and provably achieves no-regret performance under various sampling regimes. Empirically, ONLINESUBMOD outperforms both traditional curriculum learning and bi-level optimization approaches across vision and language datasets, showing superior accuracy-efficiency tradeoffs. More broadly, we show that validation-driven reward metrics offer a principled way to guide the curriculum schedule.
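
As a rough illustration of the bandit formulation described in the abstract, the sketch below treats two submodular functions as arms, picks an arm at each step, greedily selects a subset under that function, and updates the arm's running mean reward. It is not the ONLINESUBMOD policy. The epsilon-greedy rule, the two example functions (facility location and a square-root feature coverage), the toy features, and the `reward_fn` callable are all assumptions made for illustration; the paper's utility-driven reward is not specified on this page, so a placeholder stands in for it.

```python
import math
import random


def make_arms(features):
    """Build two monotone submodular set functions over non-negative features."""
    n = len(features)
    sim = [[sum(a * b for a, b in zip(features[i], features[j]))
            for j in range(n)] for i in range(n)]

    def facility_location(subset):
        # Representativeness: each item is served by its most similar pick.
        if not subset:
            return 0.0
        return sum(max(sim[i][j] for j in subset) for i in range(n))

    def feature_coverage(subset):
        # Diversity: concave (sqrt) of the mass accumulated per feature dimension.
        dims = len(features[0])
        return sum(math.sqrt(sum(features[j][d] for j in subset))
                   for d in range(dims))

    return [facility_location, feature_coverage]


def greedy_select(f, n, k):
    """Standard greedy maximization: add the item with the largest marginal gain."""
    selected = []
    for _ in range(k):
        base = f(selected)
        best, best_gain = None, float("-inf")
        for j in range(n):
            if j in selected:
                continue
            gain = f(selected + [j]) - base
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
    return selected


class EpsilonGreedyBandit:
    """Generic epsilon-greedy bandit (an assumed stand-in, not the paper's policy)."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        return max(range(len(self.values)), key=lambda a: self.values[a])  # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental running mean of the rewards observed for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def run_curriculum(features, reward_fn, steps=20, k=2, epsilon=0.2, seed=0):
    """Each step: pick an arm, select a subset with it, score it, update the arm."""
    arms = make_arms(features)
    bandit = EpsilonGreedyBandit(len(arms), epsilon, seed)
    history = []
    for _ in range(steps):
        arm = bandit.choose()
        subset = greedy_select(arms[arm], len(features), k)
        reward = reward_fn(subset)  # hypothetical, e.g. a validation-based score
        bandit.update(arm, reward)
        history.append((arm, subset, reward))
    return bandit, history


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # made-up features
    # Placeholder reward: 1.0 if the subset touches both toy clusters, else 0.0.
    reward = lambda s: float(any(j < 2 for j in s) and any(j >= 2 for j in s))
    bandit, _ = run_curriculum(toy, reward)
    print(bandit.counts, bandit.values)
```

In an actual training loop, `reward_fn` would presumably train on the selected subset and measure a change on held-out data, which is what makes the arm choice adaptive over the course of training.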

Videos

Bandit Guided Submodular Curriculum for Adaptive Subset Selection · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques