Online Learning with Improving Agents: Multiclass, Budgeted Agents and Bandit Learners

Sajad Ashkezari; Shai Ben-David

arXiv:2602.17103·cs.LG·February 20, 2026

TL;DR

This paper explores a model of online learning where agents can modify features to achieve better labels, extending theoretical understanding to multiclass, bandit feedback, and cost modeling scenarios.

Contribution

The paper introduces combinatorial dimensions that characterize online learnability in the learning-with-improvements model, extending prior work to multiclass, bandit-feedback, and cost-aware settings.

Findings

01 Derived combinatorial dimensions for learnability

02 Extended analysis to multiclass scenarios

03 Analyzed bandit feedback and cost modeling

Abstract

We investigate the recently introduced model of learning with improvements, where agents are allowed to make small changes to their feature values in order to warrant a more desirable label. We substantially extend previously published results by providing combinatorial dimensions that characterize online learnability in this model and by analyzing the multiclass setup, learnability under bandit feedback, a model of agents' costs for making improvements, and more.
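To make the setting concrete, here is a minimal Python sketch of one round of interaction in a binary toy version of the model. It is an illustration under stated assumptions, not the paper's formal definitions: the integer feature space, the radius-1 `neighbors` function, the threshold classifiers, and the rule that a round counts as a mistake when the prediction at the agent's final point disagrees with the true label are all hypothetical choices made for this example.

```python
from typing import Callable, Iterable, Tuple

Classifier = Callable[[int], int]


def neighbors(x: int) -> Iterable[int]:
    """Assumed improvement set: points reachable by a 'small change' (radius 1)."""
    return (x - 1, x + 1)


def f_star(x: int) -> int:
    """Assumed ground truth: the desirable label 1 is genuinely earned at x >= 5."""
    return int(x >= 5)


def make_threshold(t: int) -> Classifier:
    """Hypothetical learner hypothesis: predict 1 exactly when x >= t."""
    return lambda x: int(x >= t)


def respond(x: int, h: Classifier) -> int:
    """Agent reaction: stay if already predicted 1, otherwise move to a
    reachable point that h labels 1, if such a point exists."""
    if h(x) == 1:
        return x
    for candidate in neighbors(x):
        if h(candidate) == 1:
            return candidate
    return x


def run_round(x: int, h: Classifier) -> Tuple[int, bool]:
    """Play one round: the agent reacts to h, then the prediction at the
    agent's final point is compared with the true label there."""
    x_final = respond(x, h)
    mistake = h(x_final) != f_star(x_final)
    return x_final, mistake
```

In this toy setup, a learner that publishes the too-lenient threshold 4 lets an agent at 3 move to 4 and receive label 1 without truly qualifying, which the sketch scores as a mistake. With threshold 5, an agent at 4 moves to 5 and genuinely earns the label, so no mistake is recorded.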

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques