Law-Strength Frontiers and a No-Free-Lunch Result for Law-Seeking Reinforcement Learning on Volatility Law Manifolds

Jian'an Zhang

arXiv:2511.17304 · q-fin.CP · November 24, 2025

TL;DR

This paper investigates reinforcement learning on volatility surfaces, revealing fundamental limitations of law regularization in aligning high-capacity RL agents with no-arbitrage principles, and demonstrating that law-seeking RL cannot outperform structural strategies in this context.

Contribution

It introduces a formal framework for law constraints on volatility surfaces, proves theoretical limitations of law-seeking RL, and empirically shows the inefficacy of such approaches compared to structural strategies.

Findings

01

Law-seeking RL underperforms structural strategies on P&L and GFI.

02

Stronger penalties can worsen RL performance due to ghost arbitrage.

03

A no-free-lunch theorem shows fundamental limits of law regularization in this setting.

Abstract

We study reinforcement learning (RL) on volatility surfaces through the lens of Scientific AI. We ask whether axiomatic no-arbitrage laws, imposed as soft penalties on a learned world model, can reliably align high-capacity RL agents, or mainly create Goodhart-style incentives to exploit model errors. From classical static no-arbitrage conditions we build a finite-dimensional convex volatility law manifold of admissible total-variance surfaces, together with a metric law-penalty functional and a Graceful Failure Index (GFI) that normalizes law degradation under shocks. A synthetic generator produces law-consistent trajectories, while a recurrent neural world model trained without law regularization exhibits structured off-manifold errors. On this testbed we define a Goodhart decomposition \(r = r^{\mathcal{M}} + r^\perp\), where \(r^\perp\) is ghost arbitrage from off-manifold…
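
To make the idea of a law penalty on total-variance surfaces concrete, here is a minimal sketch of a single soft constraint. It is not the paper's metric law-penalty functional, which the abstract does not specify. It only encodes the standard calendar-spread condition that total variance should be non-decreasing in maturity at fixed log-moneyness. The function name `calendar_penalty`, the grid layout, and the squared-hinge form are assumptions made for illustration.

```python
from typing import List


def calendar_penalty(w: List[List[float]]) -> float:
    """Sum of squared calendar-spread violations on a total-variance grid.

    Hypothetical layout: w[i][j] is total variance at maturity index i
    (maturities sorted in increasing order) and log-moneyness index j.
    The calendar condition requires w[i + 1][j] >= w[i][j] for every j.
    """
    penalty = 0.0
    # Compare each maturity slice with the next longer one.
    for near, far in zip(w, w[1:]):
        for w_near, w_far in zip(near, far):
            # A positive value means total variance decreases with maturity.
            violation = max(0.0, w_near - w_far)
            penalty += violation ** 2
    return penalty


# Illustrative grids (made-up numbers): 2 maturities x 2 log-moneyness points.
consistent = [[0.04, 0.05], [0.08, 0.09]]
violating = [[0.04, 0.05], [0.03, 0.09]]

print(calendar_penalty(consistent))
print(calendar_penalty(violating))
```

A penalty of this kind is zero on surfaces that satisfy the condition and grows with the size of the violation, which is the sense in which a law can be imposed as a soft penalty rather than a hard constraint.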

Taxonomy

Topics: Reinforcement Learning in Robotics · Stock Market Forecasting Methods · Model Reduction and Neural Networks