KIPPO: Koopman-Inspired Proximal Policy Optimization

Andrei Cozma; Landon Harris; Hairong Qi

arXiv:2505.14566·cs.LG·May 21, 2025

Open Access

TL;DR

KIPPO introduces a novel approach combining Koopman operator theory with PPO to learn linear latent representations of complex dynamics, leading to more stable and higher-performing policies in continuous control tasks.

Contribution

The paper proposes KIPPO, a method that integrates Koopman-inspired linear approximations into PPO, improving stability and performance without changing PPO's core architecture.
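One way to picture how a linear latent model could sit alongside PPO is sketched below. This is a minimal illustration, not the authors' implementation: the page does not give the paper's loss terms, so the coupling shown here (the standard PPO clipped surrogate plus a squared prediction error for an assumed latent model `z_next ≈ A z + B u`) is an assumption. The names `A`, `B`, `beta`, and the batch keys are hypothetical.

```python
def ppo_clip_term(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample (to be maximized)."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)


def linear_prediction_error(z, u, z_next, A, B):
    """Squared error of the ASSUMED latent model z_next ~= A z + B u."""
    err = 0.0
    for i, target in enumerate(z_next):
        pred = sum(A[i][j] * z[j] for j in range(len(z)))
        pred += sum(B[i][j] * u[j] for j in range(len(u)))
        err += (pred - target) ** 2
    return err


def combined_loss(batch, A, B, eps=0.2, beta=1.0):
    """Loss to minimize: negated surrogate plus a weighted prediction error.

    `beta` is a hypothetical weighting coefficient; each batch item is a dict
    with hypothetical keys ratio, adv, z, u, z_next.
    """
    n = len(batch)
    surrogate = sum(ppo_clip_term(s["ratio"], s["adv"], eps) for s in batch) / n
    pred = sum(
        linear_prediction_error(s["z"], s["u"], s["z_next"], A, B) for s in batch
    ) / n
    return -surrogate + beta * pred
```

In this sketch the policy update itself is untouched: the prediction term only adds a penalty when the latent states fail to evolve linearly, which is one plausible reading of "Koopman-inspired" regularization.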

Findings

1. Achieves 6-60% performance improvements over the PPO baseline.

2. Reduces learning variability by up to 91%.

3. Demonstrates effectiveness across various continuous control tasks.

Abstract

Reinforcement Learning (RL) has made significant strides in various domains, and policy gradient methods like Proximal Policy Optimization (PPO) have gained popularity due to their balance in performance, training stability, and computational efficiency. These methods directly optimize policies through gradient-based updates. However, developing effective control policies for environments with complex and non-linear dynamics remains a challenge. High variance in gradient estimates and non-convex optimization landscapes often lead to unstable learning trajectories. Koopman Operator Theory has emerged as a powerful framework for studying non-linear systems through an infinite-dimensional linear operator that acts on a higher-dimensional space of measurement functions. In contrast with their non-linear counterparts, linear systems are simpler, more predictable, and easier to analyze. In…
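The idea of making non-linear dynamics linear by acting on measurement functions can be made concrete with a small worked example. The toy system below is not from the paper; it is a hypothetical two-state system chosen because it has an exact finite-dimensional lifting: with the observables (x1, x2, x1^2), the non-linear update becomes a single matrix multiplication. The coefficients and function names are illustrative assumptions.

```python
# Hypothetical toy system (not from the paper):
#   x1' = a * x1
#   x2' = b * x2 + (a**2 - b) * x1**2
A_COEF, B_COEF = 0.9, 0.5


def step_nonlinear(x):
    """One step of the non-linear dynamics in the original state space."""
    x1, x2 = x
    return (A_COEF * x1, B_COEF * x2 + (A_COEF**2 - B_COEF) * x1**2)


def lift(x):
    """Measurement functions (observables): g(x) = (x1, x2, x1^2)."""
    x1, x2 = x
    return (x1, x2, x1**2)


# In the lifted coordinates the same dynamics are exactly linear: z' = K z.
K = (
    (A_COEF, 0.0, 0.0),
    (0.0, B_COEF, A_COEF**2 - B_COEF),
    (0.0, 0.0, A_COEF**2),
)


def step_lifted(z):
    """One linear step z' = K z in the lifted space."""
    return tuple(sum(k * zi for k, zi in zip(row, z)) for row in K)


def rollout(x0, steps):
    """Roll both models forward; return (non-linear state, lifted state[:2])."""
    x, z = tuple(x0), lift(x0)
    for _ in range(steps):
        x, z = step_nonlinear(x), step_lifted(z)
    return x, z[:2]
```

Calling `rollout((1.0, -2.0), 10)` returns two state estimates that agree up to floating-point error, since the first two lifted coordinates are the original state. Most real systems have no exact finite lifting like this one, which is why learned approximations are used in practice.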

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Entropy Regularization · Proximal Policy Optimization