Decision Making in Hybrid Environments: A Model Aggregation Approach
Haolin Liu, Chen-Yu Wei, Julian Zimmert

TL;DR
This paper extends the decision estimation coefficient (DEC) framework to hybrid environments, where the dynamics are fixed but the rewards change arbitrarily over time, yielding a more precise characterization of decision complexity and improved regret bounds in reinforcement learning.
Contribution
The paper introduces an extension of the DEC for the hybrid regime that supports flexible algorithm design and covers both model-based and model-free learning.
Findings
Provides a more precise characterization of decision complexity in hybrid environments (fixed dynamics, arbitrarily changing rewards).
Develops a flexible algorithm design in which the learner learns over subsets of the hypothesis set.
Improves regret bounds for linear Q*/V* MDPs in stochastic settings.
Abstract
Recent work by Foster et al. (2021, 2022, 2023b) and Xu and Zeevi (2023) developed the decision estimation coefficient (DEC) framework, which characterizes the complexity of general online decision-making problems and provides a general algorithm design principle. These works, however, focus either on the purely stochastic regime, where the world remains fixed over time, or on the purely adversarial regime, where the world changes arbitrarily over time. For the hybrid regime, where the dynamics of the world are fixed while the reward changes arbitrarily, they give only pessimistic bounds on the decision complexity. In this work, we propose a general extension of DEC that more precisely characterizes this case. Besides applications in special cases, our framework leads to a flexible algorithm design where the learner learns over subsets of the hypothesis set, trading estimation complexity with…
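For orientation, the quantity the abstract builds on can be written out. The block below is a sketch of the standard DEC for the purely stochastic regime as defined by Foster et al. (2021); it is not the hybrid extension proposed in this paper, which this excerpt does not state. The notation is an assumption taken from that earlier line of work: \mathcal{M} is the model class, \bar{M} a reference model, \Pi the decision set, f^{M}(\pi) the expected reward of decision \pi under model M, \pi_{M} an optimal decision for M, and D_{H} the Hellinger distance.

```latex
% Standard (stochastic-regime) DEC, following Foster et al. (2021).
% Shown for context only; the paper's hybrid-regime extension is not reproduced here.
\[
  \mathrm{dec}_{\gamma}(\mathcal{M}, \bar{M})
  \;=\;
  \inf_{p \in \Delta(\Pi)} \; \sup_{M \in \mathcal{M}} \;
  \mathbb{E}_{\pi \sim p}\!\left[
    f^{M}(\pi_{M}) - f^{M}(\pi)
    - \gamma \, D_{H}^{2}\!\left(M(\pi), \bar{M}(\pi)\right)
  \right]
\]
```

Read term by term: the learner picks a distribution p over decisions, an adversary picks the worst model M in the class, and the learner's regret under M is offset by γ times the information gained about how M differs from the reference model. A small value means that low regret and informative exploration can be achieved at the same time.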
Taxonomy
Topics: Simulation Techniques and Applications
Methods: Focus
