SHARP: A Self-Evolving Human-Auditable Rubric Policy for Financial Trading Agents
Xiwen Chen, Wenhui Zhu, Songzhu Zheng, Kashif Rasul, Yueyue Deng, Huayu Li

TL;DR
SHARP is a neuro-symbolic framework that improves financial trading agents by using explicit, human-readable rules and targeted edits, leading to more robust and transparent strategies across diverse markets.
Contribution
The paper introduces a structured, symbolic policy optimization method that improves the robustness, transparency, and adaptability of LLM-based trading agents in noisy, non-stationary environments.
Findings
SHARP improves empirical performance by 10-20 percentage points.
It transforms generic heuristics into robust, auditable strategies.
SHARP maintains structural transparency and auditability in trading policies.
Abstract
Large language models (LLMs) are increasingly deployed for autonomous financial trading, a domain requiring continuous adaptation to noisy, non-stationary markets. Existing self-improving agents typically address this through unbounded free-form prompt optimization. However, in low signal-to-noise environments with delayed scalar rewards (P&L), this unstructured approach exacerbates the fundamental credit assignment problem: optimizers cannot reliably distinguish systematic logic flaws from stochastic market variance, inevitably leading to policy drift. To overcome this bottleneck, we introduce the Self-Evolving Human-Auditable Rubric Policy (SHARP), a neuro-symbolic framework that replaces unconstrained text mutation with structured, symbolic policy optimization. SHARP confines the agent's reasoning to a bounded, human-readable rubric of explicit condition-action rules. When…
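To make the idea of a bounded rubric of condition-action rules concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Rule` fields, the `MAX_RULES` bound, the first-match-wins evaluation order, the feature name `ret_5d`, and the `edit_threshold` helper are all assumptions chosen for illustration. It shows only how a policy can be kept as a short list of readable rules and changed by one targeted edit rather than a free-form rewrite.

```python
from dataclasses import dataclass, replace
from typing import Dict, List

# Hypothetical market snapshot: feature name -> value (names are illustrative).
Features = Dict[str, float]

MAX_RULES = 8  # assumed bound on rubric size; the source gives no number


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str   # human-readable statement an auditor can review
    feature: str       # which feature the condition inspects
    threshold: float
    above: bool        # True: fire when feature > threshold, else when below
    action: str        # "BUY", "SELL", or "HOLD"

    def fires(self, x: Features) -> bool:
        value = x[self.feature]
        return value > self.threshold if self.above else value < self.threshold


def decide(rubric: List[Rule], x: Features) -> str:
    """Return the action of the first rule that fires (assumed order); default HOLD."""
    if len(rubric) > MAX_RULES:
        raise ValueError("rubric exceeds its size bound")
    for rule in rubric:
        if rule.fires(x):
            return rule.action
    return "HOLD"


def edit_threshold(rubric: List[Rule], rule_id: str, new_threshold: float) -> List[Rule]:
    """Targeted edit: change one rule's threshold and leave every other rule untouched."""
    return [
        replace(r, threshold=new_threshold) if r.rule_id == rule_id else r
        for r in rubric
    ]


rubric = [
    Rule("R1", "Sell when the 5-day return exceeds 4%", "ret_5d", 0.04, True, "SELL"),
    Rule("R2", "Buy when the 5-day return is below -3%", "ret_5d", -0.03, False, "BUY"),
]

before = decide(rubric, {"ret_5d": 0.05})
edited = edit_threshold(rubric, "R1", 0.06)
after = decide(edited, {"ret_5d": 0.05})
print(before, after)
```

Because each edit touches a single named rule, the difference between two policy versions is a one-line change that a human reviewer can read directly.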
