Multifidelity Reinforcement Learning with Control Variates
Sami Khairy, Prasanna Balaprakash

TL;DR
This paper introduces a multifidelity reinforcement learning approach that leverages low- and high-fidelity data through control variates to improve policy learning efficiency and performance.
Contribution
It proposes a novel multifidelity estimator based on control variates and develops the MFMCRL algorithm, enhancing RL performance with limited high-fidelity data.
Findings
Given only a limited amount of high-fidelity data, MFMCRL outperforms standard RL.
Reducing the variance of the state-action value estimate improves both policy evaluation and policy learning.
Theoretical bounds support the effectiveness of the approach.
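As a reminder of why variance reduction is expected here, the textbook control-variate identity can be written out. This is the standard single-variate result under the simplifying assumption that the mean of the control is known exactly; it is not the paper's own bound, and the symbols X (high-fidelity return), Y (low-fidelity return), and \alpha are illustrative names rather than the authors' notation.

```latex
\begin{align*}
  Z_\alpha &= X - \alpha\,\bigl(Y - \mathbb{E}[Y]\bigr),
  \qquad \mathbb{E}[Z_\alpha] = \mathbb{E}[X], \\
  \operatorname{Var}(Z_\alpha)
    &= \operatorname{Var}(X) - 2\alpha\,\operatorname{Cov}(X,Y)
       + \alpha^{2}\operatorname{Var}(Y), \\
  \alpha^{*} &= \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(Y)},
  \qquad
  \operatorname{Var}(Z_{\alpha^{*}}) = \bigl(1 - \rho_{XY}^{2}\bigr)\operatorname{Var}(X).
\end{align*}
```

The stronger the correlation \rho_{XY} between the low- and high-fidelity returns, the larger the reduction in variance.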
Abstract
In many computational science and engineering applications, the output of a system of interest corresponding to a given input can be queried at different levels of fidelity with different costs. Typically, low-fidelity data is cheap and abundant, while high-fidelity data is expensive and scarce. In this work we study the reinforcement learning (RL) problem in the presence of multiple environments with different levels of fidelity for a given control task. We focus on improving the RL agent's performance with multifidelity data. Specifically, a multifidelity estimator that exploits the cross-correlations between the low- and high-fidelity returns is proposed to reduce the variance in the estimation of the state-action value function. The proposed estimator, which is based on the method of control variates, is used to design a multifidelity Monte Carlo RL (MFMCRL) algorithm that improves…
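The control-variate idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MFMCRL implementation: the function names `control_variate_mean` and `demo`, the synthetic Gaussian returns, and the sample sizes are all hypothetical. It assumes a few paired low- and high-fidelity returns plus a larger pool of cheap low-fidelity returns, and estimates the control coefficient from the paired samples.

```python
import numpy as np


def control_variate_mean(high, low_paired, low_all):
    """Estimate E[high] using low-fidelity returns as a control variate.

    high       : scarce high-fidelity returns
    low_paired : low-fidelity returns paired one-to-one with `high`
    low_all    : larger pool of low-fidelity returns (hypothetical setup)
    """
    high = np.asarray(high, dtype=float)
    low_paired = np.asarray(low_paired, dtype=float)
    low_all = np.asarray(low_all, dtype=float)

    var_low = np.var(low_paired, ddof=1)
    if var_low == 0.0:
        # No variation in the control: fall back to the plain sample mean.
        return high.mean()

    # Sample estimate of the variance-minimizing coefficient Cov(X, Y) / Var(Y).
    alpha = np.cov(high, low_paired, ddof=1)[0, 1] / var_low

    # Correct the high-fidelity mean using the better-estimated low-fidelity mean.
    return high.mean() + alpha * (low_all.mean() - low_paired.mean())


def demo(seed=0, trials=2000, n_high=10, n_low=1000):
    """Compare estimator variance on synthetic, strongly correlated returns."""
    rng = np.random.default_rng(seed)
    plain, cv = [], []
    for _ in range(trials):
        low_all = rng.normal(1.0, 1.0, n_low)
        low_paired = low_all[:n_high]
        # Synthetic high-fidelity return, correlated with the low-fidelity one.
        high = 2.0 * low_paired + rng.normal(0.0, 0.3, n_high)
        plain.append(high.mean())
        cv.append(control_variate_mean(high, low_paired, low_all))
    return np.var(plain), np.var(cv)


if __name__ == "__main__":
    var_plain, var_cv = demo()
    print(f"plain Monte Carlo variance: {var_plain:.4f}")
    print(f"control-variate variance:   {var_cv:.4f}")
```

In this toy setting the control-variate estimate should have noticeably lower variance than the plain high-fidelity sample mean, because the two fidelities are strongly correlated; with weakly correlated returns the gain shrinks.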
Taxonomy
Topics: Auction Theory and Applications · Energy Efficiency and Management
