Improved Algorithms for Nash Welfare in Linear Bandits
Dhruv Sarkar, Nishant Pandey, Sayak Ray Chowdhury

TL;DR
This paper introduces new analytical tools to achieve order-optimal Nash regret bounds in linear bandits and proposes a flexible framework for fairness-aware regret minimization, extending to p-means regret with strong empirical results.
Contribution
It provides the first order-optimal Nash regret bounds for linear bandits and introduces a general framework for p-means regret, unifying fairness and utility objectives.
Findings
Achieves order-optimal Nash regret bounds in linear bandits.
Proposes FairLinBandit, a meta-algorithm for fairness-aware regret minimization.
Demonstrates superior empirical performance over existing baselines.
Abstract
Nash regret has recently emerged as a principled fairness-aware performance metric for stochastic multi-armed bandits, motivated by the Nash Social Welfare objective. Although this notion has been extended to linear bandits, existing results suffer from suboptimality in the ambient dimension d, stemming from proof techniques that rely on restrictive concentration inequalities. In this work, we resolve this open problem by introducing new analytical tools that yield an order-optimal Nash regret bound in linear bandits. Beyond Nash regret, we initiate the study of p-means regret in linear bandits, a unifying framework that interpolates between fairness and utility objectives and strictly generalizes Nash regret. We propose a generic algorithmic framework, FairLinBandit, that works as a meta-algorithm on top of any linear bandit strategy. We instantiate this framework using two bandit…
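To make the regret notions named in the abstract concrete, here is a minimal Python sketch of a generalized (power) mean and the regret it induces. It assumes the usual convention that p-means regret compares the best arm's expected reward with the p-mean of the per-round expected rewards, with p = 0 (the geometric mean) corresponding to Nash regret and p = 1 to average regret. The function names `p_mean` and `p_mean_regret` and the example numbers are illustrative and not taken from the paper, whose exact definition may differ in detail.

```python
import math


def p_mean(values, p):
    """Generalized (power) mean of positive values.

    p = 1 is the arithmetic mean; p = 0 is taken as the geometric mean,
    which is the limit of the power mean as p -> 0.
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty list of positive numbers")
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


def p_mean_regret(expected_rewards, best_reward, p):
    """Best expected reward minus the p-mean of the per-round expected rewards."""
    return best_reward - p_mean(expected_rewards, p)


# Hypothetical run: the learner plays the best arm (mean 0.9) in three
# rounds and a poor arm (mean 0.1) in one round.
rewards = [0.9, 0.9, 0.1, 0.9]
average_regret = p_mean_regret(rewards, 0.9, p=1)
nash_regret = p_mean_regret(rewards, 0.9, p=0)
harmonic_regret = p_mean_regret(rewards, 0.9, p=-1)
```

On this toy input the average regret is about 0.20, the Nash regret about 0.38, and the p = -1 regret about 0.60. Since the power mean is non-decreasing in p, a smaller p penalizes a single low-reward round more heavily, which is the sense in which the p-means family interpolates between utility and fairness.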
Taxonomy
Topics: Advanced Bandit Algorithms Research · Ethics and Social Impacts of AI · Mobile Crowdsensing and Crowdsourcing
