Adaptive Budgeted Multi-Armed Bandits for IoT with Dynamic Resource Constraints
Shubham Vaishnav, Praveen Kumar Donta, Sindri Magnússon

TL;DR
This paper introduces a novel adaptive learning framework for IoT devices that manages fluctuating resource constraints over time, balancing performance and compliance through a decaying violation budget.
Contribution
It proposes a Budgeted UCB algorithm with theoretical guarantees of sublinear regret and logarithmic constraint violations in dynamic IoT environments.
Findings
Faster adaptation in wireless communication simulations
Better constraint satisfaction than standard methods
Achieves sublinear regret and logarithmic violations
Abstract
Internet of Things (IoT) systems increasingly operate in environments where devices must respond in real time while managing fluctuating resource constraints, including energy and bandwidth. Yet, current approaches often fall short in addressing scenarios where operational constraints evolve over time. To address these limitations, we propose a novel Budgeted Multi-Armed Bandit framework tailored for IoT applications with dynamic operational limits. Our model introduces a decaying violation budget, which permits limited constraint violations early in the learning process and gradually enforces stricter compliance over time. We present the Budgeted Upper Confidence Bound (UCB) algorithm, which adaptively balances performance optimization and compliance with time-varying constraints. We provide theoretical guarantees showing that Budgeted UCB achieves sublinear regret and logarithmic…
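The abstract describes the mechanism only at a high level, so the following is a minimal illustrative sketch and not the authors' Budgeted UCB. It assumes a polynomially decaying allowance `b0 / t**decay`, a standard UCB1 exploration bonus, a sinusoidal time-varying limit, and made-up arm parameters; all function names, parameters, and numbers are hypothetical. The idea shown is that an arm counts as admissible when its estimated cost fits within the current limit plus the remaining allowance, so costly arms can be explored early and are filtered out as the allowance shrinks.

```python
import math
import random


def decaying_budget(t, b0=1.0, decay=0.5):
    """Hypothetical violation allowance at round t >= 1; shrinks as t grows."""
    return b0 / (t ** decay)


def select_arm(t, counts, reward_means, cost_means, limit, b0=1.0, decay=0.5):
    """Choose the admissible arm with the highest UCB1 index on reward.

    An arm is admissible when its estimated cost is at most the current
    limit plus the decaying allowance (an assumption of this sketch).
    """
    # Play every arm once before trusting any estimate.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    slack = limit + decaying_budget(t, b0, decay)
    arms = range(len(counts))
    admissible = [a for a in arms if cost_means[a] <= slack]
    if not admissible:
        # Nothing fits: fall back to the arm with the lowest estimated cost.
        return min(arms, key=lambda a: cost_means[a])

    def ucb(a):
        return reward_means[a] + math.sqrt(2.0 * math.log(t) / counts[a])

    return max(admissible, key=ucb)


def run(horizon=500, seed=0):
    """Toy simulation with made-up arms and a slowly oscillating limit."""
    rng = random.Random(seed)
    reward_p = [0.9, 0.6, 0.4]  # hypothetical Bernoulli reward probabilities
    cost_mu = [0.8, 0.4, 0.2]   # hypothetical mean resource cost per arm
    k = len(reward_p)
    counts = [0] * k
    reward_means = [0.0] * k
    cost_means = [0.0] * k
    violations = 0

    for t in range(1, horizon + 1):
        limit = 0.5 + 0.1 * math.sin(t / 50.0)  # assumed time-varying constraint
        arm = select_arm(t, counts, reward_means, cost_means, limit)
        reward = 1.0 if rng.random() < reward_p[arm] else 0.0
        cost = min(1.0, max(0.0, rng.gauss(cost_mu[arm], 0.05)))

        # Incremental running means of reward and cost for the played arm.
        counts[arm] += 1
        reward_means[arm] += (reward - reward_means[arm]) / counts[arm]
        cost_means[arm] += (cost - cost_means[arm]) / counts[arm]
        violations += int(cost > limit)

    return counts, violations
```

Calling `run()` returns the per-arm pull counts and the number of rounds in which the realized cost exceeded the limit. In this toy setup the high-reward, high-cost arm is admissible only in early rounds, which mirrors the "permit early, tighten later" behaviour the abstract describes.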
