A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

Pablo Samuel Castro; Tyler Kastner; Prakash Panangaden; Mark Rowland

arXiv:2310.19804·cs.LG·November 1, 2023·2 cites

TL;DR

This paper introduces a new kernel-based approach to behavioural metrics in Markov decision processes, providing theoretical guarantees and empirical evidence for improved reinforcement learning representations.

Contribution

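It presents a kernel perspective on behavioural metrics, uses it to define a new metric that is provably equivalent to the MICo distance, and derives a bound on value function differences together with a low-distortion embedding into finite-dimensional Euclidean space.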

Findings

01. Bounded value function differences using the new metric

02. Proved low-distortion Euclidean embedding of the metric

03. Empirical results show improved reinforcement learning performance

Abstract

Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning. We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We leverage this new perspective to define a new metric that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). The kernel perspective further enables us to provide new theoretical results, which have so far eluded prior work. These include bounding value function differences by means of our metric, and the demonstration that our metric can be provably embedded into a finite-dimensional Euclidean space with low distortion error. These are two crucial properties when using behavioural metrics for reinforcement learning representations. We complement our theory with strong empirical results that demonstrate…
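To make the link between positive definite kernels and distances concrete, the sketch below computes the standard distance induced by a kernel, d(x, y) = sqrt(k(x, x) + k(y, y) - 2 k(x, y)). It is only an illustration of that general identity: the Gaussian kernel, the `bandwidth` parameter, and the toy `states` array are assumptions made for the example and are not the kernel or the metric defined in the paper.

```python
import numpy as np


def gaussian_gram(points, bandwidth=1.0):
    """Gram matrix of a Gaussian (RBF) kernel, a standard positive definite kernel.

    The kernel choice is an assumption for illustration only.
    """
    sq_norms = np.sum(points ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kernel_distance(gram):
    """Pairwise distances induced by a kernel, computed from its Gram matrix.

    Uses d(x, y)^2 = k(x, x) + k(y, y) - 2 k(x, y).
    """
    diag = np.diag(gram)
    sq = diag[:, None] + diag[None, :] - 2.0 * gram
    return np.sqrt(np.maximum(sq, 0.0))


# Hypothetical feature vectors standing in for three states.
states = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
gram = gaussian_gram(states)
dist = kernel_distance(gram)
print(dist)
```

The resulting matrix is symmetric with a zero diagonal and satisfies the triangle inequality, because it is the norm of the difference between the feature maps of the two points in the kernel's reproducing kernel Hilbert space.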

Taxonomy

Topics: Reinforcement Learning in Robotics