Auction-Based Online Policy Adaptation for Evolving Objectives
Guruprerana Shabadi, Kaushik Mallik

TL;DR
This paper introduces an auction-based modular framework for multi-objective reinforcement learning in which the set of active objectives can change at runtime, enabling efficient and interpretable policy adjustments in real time.
Contribution
It proposes a novel auction mechanism for coordinating selfish local policies in multi-objective RL, one policy per objective, allowing immediate adaptation when objectives change and improved performance over monolithic policies.
Findings
The method outperforms monolithic PPO-trained policies in experiments.
Policies can adapt instantly to objective changes by adding or removing local policies.
Theoretical analysis guarantees the existence of Nash equilibria with desirable properties.
Abstract
We consider multi-objective reinforcement learning problems where objectives come from an identical family -- such as the class of reachability objectives -- and may appear or disappear at runtime. Our goal is to design adaptive policies that can efficiently adjust their behaviors as the set of active objectives changes. To solve this problem, we propose a modular framework where each objective is supported by a selfish local policy, and coordination is achieved through a novel auction-based mechanism: policies bid for the right to execute their actions, with bids reflecting the urgency of the current state. The highest bidder selects the action, enabling a dynamic and interpretable trade-off among objectives. Going back to the original adaptation problem, when objectives change, the system adapts by simply adding or removing the corresponding policies. Moreover, as objectives arise…
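The bidding scheme described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the `LocalPolicy` and `AuctionCoordinator` classes, the grid-world state, and in particular the hand-written bid (inverse distance to the target), which stands in for whatever urgency signal the authors' learned policies produce. The sketch only shows the control flow: each local policy submits a bid, the highest bidder picks the action, and objectives are added or removed by adding or removing their policies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

State = Tuple[int, int]   # (x, y) position on a grid; an assumption for illustration
Action = str


@dataclass
class LocalPolicy:
    """Hypothetical local policy: proposes an action and a bid for a given state."""
    act: Callable[[State], Action]
    bid: Callable[[State], float]


def reach_policy(target: State) -> LocalPolicy:
    """Toy reachability policy: walk toward `target`, bid higher when it is close.

    The inverse-distance bid is an illustrative stand-in for "urgency",
    not the bidding rule used in the paper.
    """
    tx, ty = target

    def act(state: State) -> Action:
        x, y = state
        if x < tx:
            return "right"
        if x > tx:
            return "left"
        if y < ty:
            return "up"
        if y > ty:
            return "down"
        return "stay"

    def bid(state: State) -> float:
        x, y = state
        return 1.0 / (1 + abs(x - tx) + abs(y - ty))

    return LocalPolicy(act=act, bid=bid)


class AuctionCoordinator:
    """Runs one auction per step; the highest bidder chooses the action."""

    def __init__(self) -> None:
        self.policies: Dict[str, LocalPolicy] = {}

    def add(self, name: str, policy: LocalPolicy) -> None:
        # A new objective appears: register its local policy.
        self.policies[name] = policy

    def remove(self, name: str) -> None:
        # An objective disappears: drop its local policy.
        self.policies.pop(name, None)

    def step(self, state: State) -> Tuple[str, Action]:
        if not self.policies:
            raise ValueError("no active objectives")
        # Ties are broken by insertion order, an arbitrary choice in this sketch.
        winner = max(self.policies, key=lambda n: self.policies[n].bid(state))
        return winner, self.policies[winner].act(state)


if __name__ == "__main__":
    coordinator = AuctionCoordinator()
    coordinator.add("A", reach_policy((3, 0)))
    coordinator.add("B", reach_policy((0, 1)))
    print(coordinator.step((0, 0)))   # B is closer, so it outbids A
    coordinator.remove("B")           # objective B disappears at runtime
    print(coordinator.step((0, 0)))   # A now wins by default
```

Because the winner's name is returned alongside the action, each step records which objective drove the decision, which is one way to read the interpretability claim. No retraining happens in `add` or `remove`; whether that holds for the authors' trained policies is a claim of the paper, not something this sketch demonstrates.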
