Learning Efficient Representations for Reinforcement Learning

Yanping Huang

arXiv:1509.02413 · cs.AI · September 9, 2015

TL;DR

This paper proposes a method for automatically constructing structured kernels for kernel-based reinforcement learning, aiming to improve the efficiency and compactness of value function approximation in large or continuous state spaces.

Contribution

It introduces a compositional approach to kernel structure search using a context-free grammar and a greedy algorithm, advancing automatic representation learning in RL.
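The idea of a greedy search over a kernel grammar can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two base kernels (`rbf`, `linear`), the grammar K -> B | (K + B) | (K * B), the held-out kernel ridge regression error used as the score, and the synthetic regression targets standing in for sampled value-function targets.

```python
import numpy as np

# Hypothetical base kernels; the paper's actual kernel set is not given here.
def rbf(X, Y, length_scale=1.0):
    """Squared-exponential kernel between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def linear(X, Y):
    """Dot-product kernel between the rows of X and Y."""
    return X @ Y.T

BASE = {"RBF": rbf, "LIN": linear}

def expand(expr):
    """Apply the assumed grammar rules K -> (K + B) | (K * B) to one expression."""
    name, fn = expr
    for bname, bfn in BASE.items():
        yield (f"({name} + {bname})", lambda X, Y, f=fn, g=bfn: f(X, Y) + g(X, Y))
        yield (f"({name} * {bname})", lambda X, Y, f=fn, g=bfn: f(X, Y) * g(X, Y))

def holdout_error(kernel, X_tr, y_tr, X_va, y_va, ridge=1e-3):
    """Assumed score: mean squared error of kernel ridge regression on held-out data."""
    K = kernel(X_tr, X_tr)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X_tr)), y_tr)
    return float(np.mean((kernel(X_va, X_tr) @ alpha - y_va) ** 2))

def greedy_search(X_tr, y_tr, X_va, y_va, max_steps=3):
    """Start from the best base kernel and keep the best one-step expansion while it helps."""
    score = lambda e: holdout_error(e[1], X_tr, y_tr, X_va, y_va)
    best = min(BASE.items(), key=score)
    best_err = score(best)
    for _ in range(max_steps):
        cand = min(expand(best), key=score)
        cand_err = score(cand)
        if cand_err >= best_err:  # stop when no expansion improves the score
            break
        best, best_err = cand, cand_err
    return best[0], best_err

# Synthetic 1-D targets (an assumption, not data from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(60, 1))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 0]
name, err = greedy_search(X[:40], y[:40], X[40:], y[40:])
print(name, err)
```

Sums and products of positive semidefinite kernels are again positive semidefinite, so every expression the grammar produces remains a valid kernel. A greedy search only follows the single best expansion at each step, so it can miss structures that a broader search would find.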

Findings

01 Demonstrates the method on synthetic problems

02 Shows improved compactness over baseline methods

03 Plans for comparative evaluation with RL baselines

Abstract

Markov decision processes (MDPs) are a well-studied framework for solving sequential decision-making problems under uncertainty. Exact methods for solving MDPs based on dynamic programming, such as policy iteration and value iteration, are effective on small problems. In problems with a large discrete state space or with continuous state spaces, a compact representation is essential for providing efficient approximate solutions to MDPs. Commonly used approximation algorithms involve constructing basis functions for projecting the value function onto a low-dimensional subspace, and building a factored or hierarchical graphical model to decompose the transition and reward functions. However, hand-coding a good compact representation for a given reinforcement learning (RL) task can be quite difficult and time-consuming. Recent approaches have attempted to automatically discover…

Taxonomy

Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Software Reliability and Analysis Research