Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies
Sinan Ibrahim, Gr\'egoire Ouerdane, Hadi Salloum, Henni Ouerdane, Stefan Streif, Pavel Osinenko

TL;DR
This paper introduces a rigorous benchmarking framework for reinforcement learning that constructs controlled environments with known optimal policies, enabling precise evaluation of algorithms across diverse scenarios.
Contribution
It extends converse optimality to nonlinear stochastic systems, allowing systematic generation of benchmark environments with guaranteed optimal policies.
Findings
Framework enables controlled environment generation
Demonstrates capacity for comprehensive RL evaluation
Provides a reproducible basis for benchmarking
Abstract
The objective comparison of Reinforcement Learning (RL) algorithms is notoriously complex as outcomes and benchmarking of performances of different RL approaches are critically sensitive to environmental design, reward structures, and stochasticity inherent in both algorithmic learning and environmental dynamics. To manage this complexity, we introduce a rigorous benchmarking framework by extending converse optimality to discrete-time, control-affine, nonlinear systems with noise. Our framework provides necessary and sufficient conditions, under which a prescribed value function and policy are optimal for constructed systems, enabling the systematic generation of benchmark families via homotopy variations and randomized parameters. We validate it by automatically constructing diverse environments, demonstrating our framework's capacity for a controlled and comprehensive evaluation…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsReinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Evolutionary Algorithms and Applications
