MM-KTD: Multiple Model Kalman Temporal Differences for Reinforcement Learning

Parvin Malekzadeh, Mohammad Salimibeni, Arash Mohammadi, Akbar Assa, and Konstantinos N. Plataniotis

arXiv:2006.00195·cs.LG·June 2, 2020

TL;DR

This paper introduces MM-KTD, a novel reinforcement learning method combining multiple model Kalman filters with active learning to improve sample efficiency and policy learning in continuous state spaces.

Contribution

The paper proposes the MM-KTD framework that adaptively tunes Kalman filter parameters and employs active learning to enhance sample efficiency in RL.

Findings

01. MM-KTD outperforms deep neural network-based methods in sample efficiency.

02. The active learning component improves exploration of uncertain states.

03. Experimental results demonstrate superior performance on benchmark tasks.

Abstract

There has been a surge of interest in developing advanced Reinforcement Learning (RL) systems as intelligent approaches for learning optimal control policies directly from smart agents' interactions with the environment. Objectives: In a model-free RL method with a continuous state space, the value function of the states typically needs to be approximated. In this regard, Deep Neural Networks (DNNs) provide an attractive modeling mechanism for approximating the value function from sample transitions. DNN-based solutions, however, are highly sensitive to parameter selection, are prone to overfitting, and are not very sample efficient. A Kalman-based methodology, on the other hand, could be used as an efficient alternative. Such an approach, however, commonly requires a priori information about the system (such as noise statistics) to perform efficiently. The main…
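To make the Kalman-based alternative mentioned in the abstract more concrete, here is a minimal sketch of a single-filter Kalman temporal-difference update for a linear value function. It is not the MM-KTD algorithm from the paper: the function name `ktd_update`, the one-hot features, the two-state toy problem, and the fixed values of `obs_noise_var` and `process_noise_var` are all assumptions made for illustration. The fixed noise variances stand in for the a priori noise statistics that, according to the abstract, such approaches commonly require.

```python
import numpy as np


def ktd_update(theta, P, phi, phi_next, reward, gamma=0.5,
               obs_noise_var=1e-2, process_noise_var=0.0):
    """One Kalman TD step for a linear value function V(s) = phi(s) @ theta.

    Hypothetical helper for illustration only. The reward is treated as a
    noisy observation: reward = (phi - gamma * phi_next) @ theta + noise.
    """
    # Prediction step: random-walk model on the weights (assumed).
    P = P + process_noise_var * np.eye(len(theta))

    # Observation vector implied by the Bellman equation.
    h = phi - gamma * phi_next

    # Innovation (the TD error) and its variance.
    innovation = reward - h @ theta
    s = h @ P @ h + obs_noise_var

    # Kalman gain, then posterior mean and covariance of the weights.
    gain = P @ h / s
    theta = theta + gain * innovation
    P = P - np.outer(gain, h @ P)
    return theta, P


# Toy deterministic two-state cycle (assumed): 0 -> 1 pays 1, 1 -> 0 pays 0.
features = np.eye(2)            # one-hot features, so theta[i] estimates V(i)
theta = np.zeros(2)             # prior mean of the weights
P = 10.0 * np.eye(2)            # prior covariance of the weights
transitions = [(0, 1, 1.0), (1, 0, 0.0)]

for _ in range(200):
    for state, next_state, reward in transitions:
        theta, P = ktd_update(theta, P, features[state],
                              features[next_state], reward)

print(theta)
```

With a discount factor of 0.5, the Bellman equations for this toy problem give V(0) = 4/3 and V(1) = 2/3, and the estimate in `theta` should approach those values. How close it gets depends on the assumed noise variances, which is the sensitivity the abstract points to as a limitation of Kalman-based approaches.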

Code & Models

Repositories

pmalekzadeh/MM-KTD (Official)
