Law-Strength Frontiers and a No-Free-Lunch Result for Law-Seeking Reinforcement Learning on Volatility Law Manifolds
Jian'an Zhang

TL;DR
This paper investigates reinforcement learning on volatility surfaces, revealing fundamental limitations of law regularization in aligning high-capacity RL agents with no-arbitrage principles, and demonstrating that law-seeking RL cannot outperform structural strategies in this context.
Contribution
It introduces a formal framework for law constraints on volatility surfaces, proves theoretical limitations of law-seeking RL, and empirically shows the inefficacy of such approaches compared to structural strategies.
Findings
Law-seeking RL underperforms structural strategies on P&L and GFI.
Stronger penalties can worsen RL performance due to ghost arbitrage.
A no-free-lunch theorem shows fundamental limits of law regularization in this setting.
Abstract
We study reinforcement learning (RL) on volatility surfaces through the lens of Scientific AI. We ask whether axiomatic no-arbitrage laws, imposed as soft penalties on a learned world model, can reliably align high-capacity RL agents, or mainly create Goodhart-style incentives to exploit model errors. From classical static no-arbitrage conditions we build a finite-dimensional convex volatility law manifold of admissible total-variance surfaces, together with a metric law-penalty functional and a Graceful Failure Index (GFI) that normalizes law degradation under shocks. A synthetic generator produces law-consistent trajectories, while a recurrent neural world model trained without law regularization exhibits structured off-manifold errors. On this testbed we define a Goodhart decomposition \(r = r^{\mathcal{M}} + r^\perp\), where \(r^\perp\) is ghost arbitrage from off-manifold…
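As a rough illustration of the law-penalty functional and the Goodhart decomposition described in the abstract, the sketch below scores a total-variance grid against two static conditions (non-negativity and calendar monotonicity) and splits a reward into an on-manifold part and a residual. It is not the paper's construction: the names `law_penalty`, `repair`, and `goodhart_split`, the toy reward, and the use of a running maximum in place of a true metric projection are all assumptions made for illustration, and butterfly (strike-direction) conditions are omitted.

```python
import numpy as np

def law_penalty(w):
    """Hinge-style penalty for a total-variance grid w[maturity, strike].

    Assumed laws (illustrative subset): w >= 0, and w non-decreasing
    in maturity at each strike (no calendar-spread arbitrage).
    """
    w = np.asarray(w, dtype=float)
    negativity = np.maximum(-w, 0.0).sum()
    calendar = np.maximum(w[:-1] - w[1:], 0.0).sum()
    return float(negativity + calendar)

def repair(w):
    """Map w to a law-consistent grid: clip at zero, then take a running
    maximum along maturity. This is a feasible repair, not the exact
    Euclidean projection onto the constraint set."""
    w = np.maximum(np.asarray(w, dtype=float), 0.0)
    return np.maximum.accumulate(w, axis=0)

def goodhart_split(reward_fn, w):
    """Return (r_on, r_perp) with reward_fn(w) = r_on + r_perp, where
    r_on is the reward on the repaired surface and r_perp the residual."""
    r_on = reward_fn(repair(w))
    return r_on, reward_fn(w) - r_on

# Hypothetical law-consistent surface: w = 0.04 * T * (1 + k^2).
maturities = np.array([0.25, 0.5, 1.0])
strikes = np.linspace(-0.2, 0.2, 5)
clean = 0.04 * maturities[:, None] * (1.0 + strikes[None, :] ** 2)

# A world-model error: total variance dips at the middle maturity.
noisy = clean.copy()
noisy[1, 2] = 0.0

def toy_reward(w):
    # Hypothetical reward that favors cheap variance.
    return float(-np.sum(w))

r_on, r_perp = goodhart_split(toy_reward, noisy)
print(law_penalty(clean), law_penalty(noisy), r_perp)
```

In this toy setup the law-consistent surface incurs no penalty, while the perturbed surface is penalized and yields a positive residual reward that exists only because of the off-manifold error, which is the kind of quantity the abstract calls ghost arbitrage.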
Taxonomy
Topics: Reinforcement Learning in Robotics · Stock Market Forecasting Methods · Model Reduction and Neural Networks
