Automatic feature identification in least-squares policy iteration using the Koopman operator framework

Christian Mugisho Zagabe; Sebastian Peitz

arXiv:2603.26464·cs.LG·March 30, 2026

TL;DR

This paper introduces a novel reinforcement learning algorithm that automatically learns features using a Koopman autoencoder, improving upon traditional methods by removing the need for fixed features or kernels.

Contribution

The paper presents the KAE-LSPI algorithm, which reformulates least-squares policy iteration through EDMD, enabling automatic feature learning without predefined kernels.
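
One way to see the link between the least-squares fixed-point solution and EDMD is sketched below. It is a minimal illustration and not the authors' derivation or implementation. It assumes a feature matrix `phi` for sampled state-action pairs, a matrix `phi_next` for the successor pairs under the current policy, and an invertible Gram matrix `phi.T @ phi`. The function names and the random data are hypothetical. The standard fixed-point weights solve `phi.T @ (phi - gamma * phi_next) @ w = phi.T @ rewards`. Writing the EDMD matrix as the least-squares solution `K` of `phi @ K ≈ phi_next` gives the same weights from `(I - gamma * K) @ w = c`, where `c` is the least-squares fit of the rewards in the feature space.

```python
import numpy as np


def fixed_point_weights(phi, phi_next, rewards, gamma):
    """Standard least-squares fixed-point solution: solve A w = b."""
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    return np.linalg.solve(A, b)


def fixed_point_weights_via_edmd(phi, phi_next, rewards, gamma):
    """Same weights, computed from an EDMD-style matrix K.

    K is the least-squares solution of phi @ K ~= phi_next, and c is the
    least-squares fit of the rewards in the span of the features.
    """
    K, *_ = np.linalg.lstsq(phi, phi_next, rcond=None)
    c, *_ = np.linalg.lstsq(phi, rewards, rcond=None)
    n_features = phi.shape[1]
    return np.linalg.solve(np.eye(n_features) - gamma * K, c)


# Hypothetical random data, used only to check that both routes agree.
rng = np.random.default_rng(0)
phi = rng.normal(size=(200, 4))
phi_next = rng.normal(size=(200, 4))
rewards = rng.normal(size=200)
gamma = 0.9

w_direct = fixed_point_weights(phi, phi_next, rewards, gamma)
w_edmd = fixed_point_weights_via_edmd(phi, phi_next, rewards, gamma)
```

In this sketch the features are fixed by hand. In the approach described here, the features would instead come from the encoder of a Koopman autoencoder, which is not shown.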

Findings

01

The number of features learned by KAE-LSPI is reasonable compared to the number fixed in advance for classical LSPI.

02

Convergence to near-optimal policies is comparable to that of classical LSPI and KLSPI.

03

Empirical results on the stochastic chain walk and inverted pendulum control problems show that features can be learned automatically, with none fixed a priori.

Abstract

In this paper, we present a Koopman autoencoder-based least-squares policy iteration (KAE-LSPI) algorithm in reinforcement learning (RL). The KAE-LSPI algorithm is based on reformulating the so-called least-squares fixed-point approximation method in terms of extended dynamic mode decomposition (EDMD), thereby enabling automatic feature learning via the Koopman autoencoder (KAE) framework. The approach is motivated by the lack of a systematic choice of features or kernels in linear RL techniques. We compare the KAE-LSPI algorithm with two previous works, the classical least-squares policy iteration (LSPI) and the kernel-based least-squares policy iteration (KLSPI), using stochastic chain walk and inverted pendulum control problems as examples. Unlike previous works, no features or kernels need to be fixed a priori in our approach. Empirical results show the number of features learned by…
