Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces

Motoya Ohnishi; Masahiro Yukawa; Mikael Johansson; Masashi Sugiyama

arXiv:1806.02985 · math.OC · December 3, 2018 · 1 citation

TL;DR

This paper introduces a flexible, kernel-based framework for continuous-time reinforcement learning that directly models system dynamics, effectively handling uncertainties and nonstationarity without prior environment knowledge.

Contribution

It presents a novel, general framework for continuous-time value function approximation in reproducing kernel Hilbert spaces, accommodating various kernel methods and uncertainties.

Findings

01 Framework effectively models continuous-time dynamics.

02 Handles uncertainties and nonstationarity without prior knowledge.

03 Validated through experimental results.

Abstract

Motivated by the success of reinforcement learning (RL) for discrete-time tasks such as AlphaGo and Atari games, there has been a recent surge of interest in using RL for continuous-time control of physical systems (cf. many challenging tasks in OpenAI Gym and DeepMind Control Suite). Since discretization of time is susceptible to error, it is methodologically more desirable to handle the system dynamics directly in continuous time. However, very few techniques exist for continuous-time RL and they lack flexibility in value function approximation. In this paper, we propose a novel framework for model-based continuous-time value function approximation in reproducing kernel Hilbert spaces. The resulting framework is so flexible that it can accommodate any kind of kernel-based approach, such as Gaussian processes and kernel adaptive filters, and it allows us to handle uncertainties and…
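As a rough illustration of what kernel-based value function approximation in continuous time can look like, the sketch below fits a value function for a toy one-dimensional system. It is not the authors' algorithm. Everything in it is an assumption made for the example: the deterministic dynamics dx/dt = -x, the reward r(x) = x^2, the discount rate beta, the Gaussian kernel, and the ridge regression fit. It relies on the standard relation for a discounted value function along a deterministic trajectory, beta V(x) - V'(x) f(x) = r(x), so a kernel expansion of V can be fitted by regressing observed rewards onto transformed kernel features.

```python
import numpy as np

# All constants below are assumptions chosen for this toy example.
beta = 1.0    # discount rate
sigma = 0.5   # Gaussian kernel width
lam = 1e-6    # ridge regularization strength

def f(x):
    """Assumed dynamics dx/dt = f(x)."""
    return -x

def reward(x):
    """Assumed immediate reward."""
    return x ** 2

def kernel(x, c):
    """Gaussian kernel matrix of shape (len(x), len(c))."""
    d = x[:, None] - c[None, :]
    return np.exp(-d ** 2 / (2 * sigma ** 2))

def kernel_dx(x, c):
    """Derivative of the Gaussian kernel with respect to x."""
    d = x[:, None] - c[None, :]
    return -d / sigma ** 2 * np.exp(-d ** 2 / (2 * sigma ** 2))

centers = np.linspace(-4.0, 4.0, 33)   # kernel centers
xs = np.linspace(-2.0, 2.0, 200)       # sampled states

# With V(x) = sum_i alpha_i k(x, c_i), the relation
# beta * V(x) - V'(x) * f(x) = r(x) is linear in alpha.
Phi = beta * kernel(xs, centers) - kernel_dx(xs, centers) * f(xs)[:, None]

# Ridge regression written as an augmented least-squares problem.
A = np.vstack([Phi, np.sqrt(lam) * np.eye(len(centers))])
b = np.concatenate([reward(xs), np.zeros(len(centers))])
alpha, *_ = np.linalg.lstsq(A, b, rcond=None)

def value(x):
    """Estimated value function at the given states."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return kernel(x, centers) @ alpha

# For this toy problem the exact value function is x^2 / (beta + 2).
x_test = np.linspace(-1.0, 1.0, 41)
max_err = np.max(np.abs(value(x_test) - x_test ** 2 / (beta + 2.0)))
print(f"max |V_hat - V| on [-1, 1]: {max_err:.2e}")
```

The regression targets here are immediate rewards rather than value estimates, which is what makes an ordinary supervised kernel method applicable; swapping the ridge fit for another kernel-based estimator would only change the step that solves for `alpha`.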

Taxonomy

Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Control Systems and Identification