Koopman-based surrogate modeling for reinforcement-learning-control of Rayleigh-Bénard convection

Tim Plotzki; Sebastian Peitz

arXiv:2603.28074·cs.LG·March 31, 2026


TL;DR

This paper explores using Koopman-based surrogate models, specifically Linear Recurrent Autoencoder Networks (LRANs), as training environments for reinforcement learning control of Rayleigh-Bénard convection, with the goal of reducing the computational cost of training on direct numerical simulations.

Contribution

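The paper evaluates a policy-aware surrogate training method, in which the surrogate is retrained iteratively on data collected from the evolving policy. This mitigates the distribution shift that arises when a surrogate is trained only on random-action data, and it improves RL control performance while saving computation.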
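The policy-aware loop described above can be sketched on a toy problem. This is a minimal illustration under loud assumptions, not the paper's implementation: `toy_env_step` is a hypothetical scalar system standing in for the DNS, an affine least-squares fit stands in for LRAN training, and a clipped one-step controller derived from the surrogate stands in for the RL agent. Only the structure (collect data under the current policy, aggregate, refit, repeat) is meant to carry over.

```python
import numpy as np

def toy_env_step(x, u, dt=0.1):
    """Hypothetical stand-in for one expensive DNS step (not Rayleigh-Benard)."""
    return x + dt * (-x**3 + u)

def fit_surrogate(X, U, Xn):
    """Least-squares fit of x_next ~ a*x + b*u + c (stand-in for LRAN training)."""
    F = np.column_stack([X, U, np.ones_like(X)])
    theta, *_ = np.linalg.lstsq(F, Xn, rcond=None)
    return theta  # array (a, b, c)

def policy_from_surrogate(theta, target, u_max=10.0):
    """Stand-in for RL on the surrogate: clipped one-step controller."""
    a, b, c = theta
    def policy(x):
        return float(np.clip((target - a * x - c) / b, -u_max, u_max))
    return policy

def rollout(policy, x0, steps):
    """Run the policy in the 'true' environment and record transitions."""
    X, U, Xn = [], [], []
    x = x0
    for _ in range(steps):
        u = policy(x)
        xn = toy_env_step(x, u)
        X.append(x); U.append(u); Xn.append(xn)
        x = xn
    return np.array(X), np.array(U), np.array(Xn)

def policy_aware_training(target=1.5, iterations=5, steps=50, seed=0):
    rng = np.random.default_rng(seed)
    # Stage 1: surrogate trained on random-action data only.
    random_policy = lambda x: float(rng.uniform(-1.0, 1.0))
    data = list(rollout(random_policy, 0.0, steps))
    theta = fit_surrogate(*data)
    history = [theta]
    # Stage 2: alternate policy updates and surrogate refits on aggregated data.
    for _ in range(iterations):
        policy = policy_from_surrogate(theta, target)
        new = rollout(policy, 0.0, steps)
        data = [np.concatenate([d, n]) for d, n in zip(data, new)]
        theta = fit_surrogate(*data)
        history.append(theta)
    return theta, data, history

theta, data, history = policy_aware_training()
print("surrogate parameters per iteration:")
for t in history:
    print(np.round(t, 3))
```

The random-action data keeps the toy state near zero, while the controller pushes it toward the target, so each refit sees states the first surrogate never covered. Aggregating old and new data, rather than replacing it, is a design choice of this sketch; the source does not say how the paper handles it.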

Findings

01. Surrogate-only training reduces control performance.

02. Policy-aware training improves prediction accuracy in relevant regions.

03. Pretraining with surrogates and DNS achieves state-of-the-art results with 40% less training time.

Abstract

Training reinforcement learning (RL) agents to control fluid dynamics systems is computationally expensive due to the high cost of direct numerical simulations (DNS) of the governing equations. Surrogate models offer a promising alternative by approximating the dynamics at a fraction of the computational cost, but their feasibility as training environments for RL is limited by distribution shifts, as policies induce state distributions not covered by the surrogate training data. In this work, we investigate the use of Linear Recurrent Autoencoder Networks (LRANs) for accelerating RL-based control of 2D Rayleigh-Bénard convection. We evaluate two training strategies: a surrogate trained on precomputed data generated with random actions, and a policy-aware surrogate trained iteratively using data collected from an evolving policy. Our results show that while surrogate-only training…
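For readers unfamiliar with LRANs, the general structure is a nonlinear encoder, a linear recurrence in latent space, and a nonlinear decoder. The sketch below shows only the forward pass and a multi-step prediction loss with untrained random weights. The class name `TinyLRAN`, the layer sizes, and in particular the linear action term `B @ u` are assumptions made for illustration; the abstract does not state how actions enter the latent dynamics or which loss terms the authors use.

```python
import numpy as np

class TinyLRAN:
    """Illustrative, untrained LRAN-style model (hypothetical sizes and weights)."""

    def __init__(self, n_state, n_action, n_latent, n_hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.We1 = s * rng.standard_normal((n_hidden, n_state))
        self.We2 = s * rng.standard_normal((n_latent, n_hidden))
        self.Wd1 = s * rng.standard_normal((n_hidden, n_latent))
        self.Wd2 = s * rng.standard_normal((n_state, n_hidden))
        # Linear (Koopman-like) latent operator; here a simple stable matrix.
        self.K = 0.95 * np.eye(n_latent)
        # Assumption: actions enter the latent dynamics linearly.
        self.B = s * rng.standard_normal((n_latent, n_action))

    def encode(self, x):
        return self.We2 @ np.tanh(self.We1 @ x)

    def decode(self, z):
        return self.Wd2 @ np.tanh(self.Wd1 @ z)

    def latent_step(self, z, u):
        # The only recurrence is linear in (z, u).
        return self.K @ z + self.B @ u

    def rollout(self, x0, actions):
        """Encode once, advance linearly in latent space, decode every step."""
        z = self.encode(x0)
        preds = []
        for u in actions:
            z = self.latent_step(z, u)
            preds.append(self.decode(z))
        return np.stack(preds)

def multistep_loss(model, states, actions):
    """Mean squared error between rolled-out predictions and states[1:]."""
    preds = model.rollout(states[0], actions)
    return float(np.mean((preds - states[1:]) ** 2))

model = TinyLRAN(n_state=8, n_action=2, n_latent=4)
rng = np.random.default_rng(1)
states = rng.standard_normal((6, 8))    # placeholder trajectory, not DNS data
actions = rng.standard_normal((5, 2))   # one action per transition
print(model.rollout(states[0], actions).shape)
print(multistep_loss(model, states, actions))
```

Because the state is encoded once and then advanced by matrix multiplications, a long rollout costs far less than stepping a DNS solver, which is what makes such a model attractive as an RL training environment.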
