Optimized Look-Ahead Tree Policies: A Bridge Between Look-Ahead Tree Policies and Direct Policy Search
Tobias Jung, Louis Wehenkel, Damien Ernst, Francis Maes

TL;DR
This paper introduces a hybrid policy learning method that combines look-ahead tree policies with direct policy search, resulting in more efficient, robust, and high-performing policies for sequential decision-making problems.
Contribution
The paper proposes a hybrid scheme in which a learned node scoring function guides the development of small look-ahead trees, bridging the direct policy search (DPS) and look-ahead tree (LT) approaches.
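To make the idea concrete, here is a minimal Python sketch of a look-ahead tree policy whose expansion order is driven by a parameterized node scoring function, with the parameters tuned by a simple direct search over simulated returns. Everything specific in it is an assumption made for illustration, not taken from the paper: the toy model `step`, the three features in `features`, the linear form of `score`, the rule of returning the first action on the path to the highest-return node, and the use of plain random search as the DPS optimizer.

```python
import heapq
import itertools
import random

ACTIONS = (-1, 1)
GAMMA = 0.9

def step(x, u):
    """Hypothetical deterministic model: walk on an integer line, reward peaks at 0."""
    x_next = max(-10, min(10, x + u))
    return x_next, 1.0 / (1.0 + abs(x_next))

def features(x, depth, ret):
    """Hypothetical node features: return so far, closeness to 0, shallowness."""
    return (ret, -abs(x), -depth)

def score(theta, x, depth, ret):
    """Linear node scoring function; theta is what the direct search tunes."""
    return sum(t * f for t, f in zip(theta, features(x, depth, ret)))

def lt_policy(theta, x0, budget=6):
    """Expand `budget` nodes, always the open node with the highest score,
    then return the first action on the path to the best-return node found."""
    tie = itertools.count()  # unique tie-breaker so the heap never compares states
    # Heap entries: (-score, tie, state, depth, discounted return, first action).
    open_nodes = [(0.0, next(tie), x0, 0, 0.0, None)]
    best_ret, best_action = float("-inf"), ACTIONS[0]
    for _ in range(budget):
        _, _, x, depth, ret, first = heapq.heappop(open_nodes)
        for u in ACTIONS:
            x2, r = step(x, u)
            ret2 = ret + (GAMMA ** depth) * r
            action = u if first is None else first
            if ret2 > best_ret:
                best_ret, best_action = ret2, action
            s = score(theta, x2, depth + 1, ret2)
            heapq.heappush(open_nodes, (-s, next(tie), x2, depth + 1, ret2, action))
    return best_action

def evaluate(theta, starts=(-6, 3, 8), horizon=15):
    """Average discounted return of the tree policy from a few start states."""
    total = 0.0
    for x in starts:
        for t in range(horizon):
            x, r = step(x, lt_policy(theta, x))
            total += (GAMMA ** t) * r
    return total / len(starts)

def direct_policy_search(n_candidates=30, seed=0):
    """Random search over theta (a stand-in for a real DPS optimizer)."""
    rng = random.Random(seed)
    best_theta = (1.0, 0.0, 0.0)  # baseline: score nodes by return so far
    best_value = evaluate(best_theta)
    for _ in range(n_candidates):
        theta = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        value = evaluate(theta)
        if value > best_value:
            best_theta, best_value = theta, value
    return best_theta, best_value
```

Calling `direct_policy_search()` returns the best parameter vector found and its simulated value; `lt_policy(theta, x)` then acts online with a fixed expansion budget. The design point the sketch tries to capture is that the online cost is bounded by `budget`, while the quality of the tree that budget buys depends on the learned scoring parameters.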
Findings
Outperforms pure DPS and LT policies on benchmark domains.
Requires fewer policy evaluations than existing DPS methods.
Produces robust policies that are easy to tune.
Abstract
Direct policy search (DPS) and look-ahead tree (LT) policies are two widely used classes of techniques to produce high performance policies for sequential decision-making problems. To make DPS approaches work well, one crucial issue is to select an appropriate space of parameterized policies with respect to the targeted problem. A fundamental issue in LT approaches is that, to take good decisions, such policies must develop very large look-ahead trees which may require excessive online computational resources. In this paper, we propose a new hybrid policy learning scheme that lies at the intersection of DPS and LT, in which the policy is an algorithm that develops a small look-ahead tree in a directed way, guided by a node scoring function that is learned through DPS. The LT-based representation is shown to be a versatile way of representing policies in a DPS scheme, while at the same…
Taxonomy
Topics: Optimization and Search Problems · Reinforcement Learning in Robotics · Machine Learning and Algorithms
