Marginal Utility for Planning in Continuous or Large Discrete Action Spaces
Zaheen Farraz Ahmad, Levi H. S. Lelis, Michael Bowling

TL;DR
This paper introduces a novel approach for generating candidate actions in sample-based planning by learning a generator optimized for marginal utility, improving planning efficiency in large or continuous action spaces.
Contribution
The paper proposes marginal utility as a new objective for training candidate action generators. Generators trained with it outperform hand-coded schemes built on domain knowledge in the planning tasks evaluated.
Findings
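A generator trained with the marginal utility objective outperforms hand-coded schemes.
The approach is effective in both a stochastic domain with continuous state and action spaces (curling) and a domain with a large discrete action space (a location game).
The method improves the efficiency of sample-based planning.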
Abstract
Sample-based planning is a powerful family of algorithms for generating intelligent behavior from a model of the environment. Generating good candidate actions is critical to the success of sample-based planners, particularly in continuous or large action spaces. Typically, candidate action generation exhausts the action space, uses domain knowledge, or more recently, involves learning a stochastic policy to provide such search guidance. In this paper we explore explicitly learning a candidate action generator by optimizing a novel objective, marginal utility. The marginal utility of an action generator measures the increase in value of an action over previously generated actions. We validate our approach in both curling, a challenging stochastic domain with continuous state and action spaces, and a location game with a discrete but large action space. We show that a generator trained…
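As a rough illustration of the idea in the abstract (the increase in value of an action over previously generated actions), the sketch below computes a per-candidate gain from a list of action values given in generation order. The function name `marginal_gains`, the clipping of negative gains to zero, and the choice to report the first candidate's gain as 0.0 are assumptions made for this example, not the paper's definition or training objective.

```python
from typing import List, Optional, Sequence


def marginal_gains(values: Sequence[float]) -> List[float]:
    """Return, for each candidate value in generation order, how much it
    improves on the best value among the candidates generated before it.

    Assumptions of this sketch (not taken from the paper):
    - a candidate that is no better than its predecessors gets a gain of 0.0;
    - the first candidate has no predecessors, so its gain is reported as 0.0.
    """
    gains: List[float] = []
    best_so_far: Optional[float] = None
    for v in values:
        if best_so_far is None:
            # No earlier candidates to compare against.
            gains.append(0.0)
            best_so_far = v
        else:
            # Gain is the improvement over the best earlier candidate, if any.
            gains.append(max(0.0, v - best_so_far))
            best_so_far = max(best_so_far, v)
    return gains


if __name__ == "__main__":
    # Hypothetical action values for four candidates, in generation order.
    print(marginal_gains([1.0, 3.0, 2.0, 5.0]))
```

In this toy example the third candidate adds nothing because an earlier candidate is already better, which is the kind of redundancy a generator optimized for marginal utility would be discouraged from producing.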
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning
