Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model

Zesheng Yao; Zhen-Hua Wan; Canjun Yang; Qingchao Xia; Mengqi Zhang

arXiv:2604.04986·cs.LG·April 8, 2026

TL;DR

This paper introduces an adaptive reduced-order-model (ROM) framework for reinforcement learning in flow control. It improves sample efficiency by replacing the critic with a physically informed, data-driven ROM.

Contribution

It develops a novel ROM-based reinforcement learning approach that adaptively updates the model during interactions, enhancing sample efficiency in flow control tasks.

Findings

01

Outperforms traditional linear designs in boundary layer control.

02

Achieves superior drag reduction with less data in flow past a square cylinder.

03

Requires less exploration data than standard DRL methods.

Abstract

Model-free deep reinforcement learning (DRL) methods suffer from poor sample efficiency. To overcome this limitation, this work introduces an adaptive reduced-order-model (ROM)-based reinforcement learning framework for active flow control. In contrast to conventional actor-critic architectures, the proposed approach leverages a ROM to estimate the gradient information required for controller optimization. The design of the ROM structure incorporates physical insights. The ROM integrates a linear dynamical system and a neural ordinary differential equation (NODE) for estimating the nonlinearity in the flow. The parameters of the linear component are identified via operator inference, while the NODE is trained in a data-driven manner using gradient-based optimization. During controller-environment interactions, the ROM is continuously updated with newly collected data, enabling…
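To make the ROM structure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the linear component has the form dx/dt = A x + B u and fits A and B by ordinary least squares on forward-difference derivatives, which is one simple way to pose an operator-inference problem. The function names `fit_linear_rom` and `rom_rhs`, the forward-difference choice, and the `residual` callable (a stand-in for the NODE term) are all illustrative assumptions.

```python
import numpy as np

def fit_linear_rom(X, U, dt):
    """Least-squares fit of dx/dt = A x + B u from snapshot data.

    X: (n_steps, r) reduced states, U: (n_steps, m) control inputs.
    Assumption: derivatives are estimated by forward differences.
    """
    dXdt = (X[1:] - X[:-1]) / dt                  # forward-difference derivative estimate
    D = np.hstack([X[:-1], U[:-1]])               # data matrix with rows [x_k, u_k]
    O, *_ = np.linalg.lstsq(D, dXdt, rcond=None)  # solves D @ O = dXdt in the least-squares sense
    r = X.shape[1]
    return O[:r].T, O[r:].T                       # A has shape (r, r), B has shape (r, m)

def rom_rhs(x, u, A, B, residual=None):
    """ROM right-hand side: linear part plus an optional nonlinear correction.

    `residual` is a hypothetical callable standing in for the NODE term.
    """
    linear = A @ x + B @ u
    return linear if residual is None else linear + residual(x, u)
```

In an adaptive setting, newly collected snapshots would be appended to `X` and `U` and the fit repeated; the nonlinear term would be trained separately by gradient-based optimization, which this sketch does not attempt.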
