Toward Discovering Options that Achieve Faster Planning
Yi Wan, Richard S. Sutton

TL;DR
This paper introduces a new objective for option discovery that targets planning cost: it prefers options with which a good policy can be computed in fewer elementary operations. The authors develop an algorithm that optimizes this objective and evaluate it in a variant of the four-room domain.
Contribution
The paper proposes an objective for option discovery that rewards computational efficiency in planning, and develops an algorithm that optimizes this objective.
Findings
Higher objective values correlate with fewer planning operations.
The algorithm matches the performance of human-designed options.
Discovered options are intuitive and effective for planning.
Abstract
We propose a new objective for option discovery that emphasizes the computational advantage of using options in planning. In a sequential machine, the speed of planning is proportional to the number of elementary operations used to achieve a good policy. For episodic tasks, the number of elementary operations depends on the number of options composed by the policy in an episode and the number of options being considered at each decision point. To reduce the amount of computation in planning, for a given set of episodic tasks and a given number of options, our objective prefers options with which it is possible to achieve a high return by composing few options, and also prefers a smaller set of options to choose from at each decision point. We develop an algorithm that optimizes the proposed objective. In a variant of the classic four-room domain, we show that 1) a higher objective value…
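The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's objective or algorithm. It assumes a hypothetical corridor task in which each option is a deterministic jump to the right, and it counts one elementary operation per (state, option) evaluation during synchronous value iteration. The function name `count_planning_ops` and the operation-counting convention are assumptions made for illustration only.

```python
import math


def count_planning_ops(n_states, jumps):
    """Plan on a corridor 0..n_states-1 whose goal is the last state.

    Each option in `jumps` moves right by a fixed number of states
    (clipped at the goal). Values are the minimum number of options
    that must be composed to reach the goal. Returns the number of
    (state, option) evaluations performed and the final values.
    """
    goal = n_states - 1
    values = [math.inf] * n_states
    values[goal] = 0
    ops = 0
    while True:
        new_values = list(values)
        for s in range(goal):          # every non-goal state
            for j in jumps:            # every option considered at s
                ops += 1               # one elementary operation (assumed unit)
                nxt = min(s + j, goal)
                new_values[s] = min(new_values[s], 1 + values[nxt])
        if new_values == values:       # no change: planning has converged
            return ops, values
        values = new_values


if __name__ == "__main__":
    for jumps in ([1], [1, 4], list(range(1, 9))):
        ops, values = count_planning_ops(9, jumps)
        print(jumps, "ops:", ops, "options composed from state 0:", values[0])
```

In this toy setting, adding one well-chosen long jump reduces the total operation count because fewer options need to be composed, while offering every possible jump raises the count again because more options are evaluated at each decision point. This mirrors, in a very simplified form, the two preferences the abstract attributes to the objective.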
Taxonomy
Topics: Reservoir Engineering and Simulation Methods · Auction Theory and Applications · Reinforcement Learning in Robotics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
