Inferring Transition Dynamics from Value Functions

Jacob Adamczyk

arXiv:2501.09081·cs.LG·January 17, 2025

TL;DR

This paper demonstrates that converged value functions in reinforcement learning inherently encode information about environment dynamics, enabling the inference of transition models directly from value functions, thus bridging model-free and model-based approaches.

Contribution

The paper introduces a method for inferring environment dynamics from value functions and discusses the conditions under which the inferred model is identifiable, providing a theoretical foundation for the approach.
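
The idea of reading dynamics out of a value function can be illustrated with a short derivation. This is only a sketch under simplifying assumptions that are not stated on this page: deterministic transitions $s' = f(s,a)$, a known reward $r(s,a)$, a fixed policy $\pi$, and a discount factor $\gamma > 0$. The symbols $f$, $r$, $Q^\pi$, and $V^\pi$ are notation chosen for this illustration and may differ from the paper's.

```latex
\begin{align}
Q^\pi(s,a) &= r(s,a) + \gamma\, V^\pi(s'), \qquad s' = f(s,a), \\
V^\pi(s') &= \frac{Q^\pi(s,a) - r(s,a)}{\gamma}.
\end{align}
```

Under these assumptions, the right-hand side of the second line is computable from the converged value functions and the reward, so it pins down the value of the next state. The next state itself is recoverable only if $V^\pi$ takes that value at a single state; when several states share it, the rearrangement yields a set of candidates rather than a unique $s'$.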

Findings

01. Value functions contain implicit environment dynamics information.

02. A simple method to infer transition models from value functions.

03. Discussion of conditions for dynamics model identifiability.

Abstract

In reinforcement learning, the value function is typically trained to solve the Bellman equation, which connects the current value to future values. This temporal dependency hints that the value function may contain implicit information about the environment's transition dynamics. By rearranging the Bellman equation, we show that a converged value function encodes a model of the underlying dynamics of the environment. We build on this insight to propose a simple method for inferring dynamics models directly from the value function, potentially mitigating the need for explicit model learning. Furthermore, we explore the challenges of next-state identifiability, discussing conditions under which the inferred dynamics model is well-defined. Our work provides a theoretical foundation for leveraging value functions in dynamics modeling and opens a new avenue for bridging model-free and…
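
The rearrangement described in the abstract can be tried on a toy problem. The following is a minimal sketch, not the author's implementation: it assumes a small tabular problem with deterministic transitions under a fixed policy, a known per-state reward, and a discount factor `gamma`. The function names `evaluate_policy` and `infer_next_states` and the example arrays are hypothetical, introduced only for illustration.

```python
import numpy as np


def evaluate_policy(next_state, reward, gamma, n_iters=1000):
    """Iterate V(s) <- r(s) + gamma * V(next(s)) for a deterministic chain."""
    next_state = np.asarray(next_state)
    reward = np.asarray(reward, dtype=float)
    v = np.zeros(len(reward))
    for _ in range(n_iters):
        v = reward + gamma * v[next_state]
    return v


def infer_next_states(v, reward, gamma, atol=1e-8):
    """Return, for each state s, every state whose value matches
    (V(s) - r(s)) / gamma, i.e. the candidates for the next state."""
    reward = np.asarray(reward, dtype=float)
    target = (v - reward) / gamma
    return [np.flatnonzero(np.isclose(v, t, atol=atol)).tolist() for t in target]


gamma = 0.9

# Case 1 (assumed toy chain): 0 -> 1 -> 2 -> 3, with state 3 absorbing.
true_next = [1, 2, 3, 3]
reward = [0.0, 0.0, 1.0, 0.0]
v = evaluate_policy(true_next, reward, gamma)
candidates = infer_next_states(v, reward, gamma)

# Case 2 (assumed toy chain): states 1 and 2 end up with the same value,
# so the successor of state 0 cannot be told apart from values alone.
tied_next = [1, 3, 3, 3]
tied_reward = [0.0, 1.0, 1.0, 0.0]
v_tied = evaluate_policy(tied_next, tied_reward, gamma)
tied_candidates = infer_next_states(v_tied, tied_reward, gamma)
```

In the first case every state has a distinct value, so each candidate list contains exactly the true successor. In the second case the candidate list for state 0 contains both tied states, which is a small instance of the next-state identifiability problem the abstract refers to.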

Taxonomy

Topics: Complex Systems and Time Series Analysis