SHARP: A Self-Evolving Human-Auditable Rubric Policy for Financial Trading Agents

Xiwen Chen; Wenhui Zhu; Songzhu Zheng; Kashif Rasul; Yueyue Deng; Huayu Li

arXiv:2605.06822 · cs.LG · May 11, 2026

TL;DR

SHARP is a neuro-symbolic framework that improves financial trading agents by using explicit, human-readable rules and targeted edits, leading to more robust and transparent strategies across diverse markets.

Contribution

It introduces a structured, symbolic policy optimization method that enhances robustness, transparency, and adaptability of LLM-based trading agents in noisy, non-stationary environments.

Findings

01. SHARP improves empirical performance by 10-20 percentage points.

02. It transforms generic heuristics into robust, auditable strategies.

03. SHARP maintains structural transparency and auditability in trading policies.

Abstract

Large language models (LLMs) are increasingly deployed for autonomous financial trading, a domain requiring continuous adaptation to noisy, non-stationary markets. Existing self-improving agents typically address this through unbounded free-form prompt optimization. However, in low signal-to-noise environments with delayed scalar rewards (P&L), this unstructured approach exacerbates the fundamental credit assignment problem: optimizers cannot reliably distinguish systematic logic flaws from stochastic market variance, inevitably leading to policy drift. To overcome this bottleneck, we introduce the Self-Evolving Human-Auditable Rubric Policy (SHARP), a neuro-symbolic framework that replaces unconstrained text mutation with structured, symbolic policy optimization. SHARP confines the agent's reasoning to a bounded, human-readable rubric of explicit condition-action rules. When…
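To make the idea of a bounded, human-readable rubric of condition-action rules concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Rule` and `Rubric` classes, the feature names (`volatility_20d`, `momentum_5d`), the thresholds, the rule budget, and the first-match evaluation order are all assumptions chosen for illustration. It shows only how a decision can be traced back to a single named rule and how one rule can be swapped without touching the others.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical feature vector: indicator name -> value.
Features = Dict[str, float]


@dataclass(frozen=True)
class Rule:
    """One explicit condition-action rule (illustrative structure only)."""
    name: str
    description: str                       # plain-language text a human can audit
    condition: Callable[[Features], bool]  # when the rule fires
    action: str                            # e.g. "buy", "sell", "hold"


class Rubric:
    """An ordered rule list capped at max_rules (an assumed bound)."""

    def __init__(self, rules: List[Rule], max_rules: int = 8) -> None:
        if len(rules) > max_rules:
            raise ValueError("rubric exceeds its rule budget")
        self.rules = list(rules)
        self.max_rules = max_rules

    def decide(self, features: Features) -> Tuple[str, str]:
        """Return (action, name of the rule that fired); first match wins."""
        for rule in self.rules:
            if rule.condition(features):
                return rule.action, rule.name
        return "hold", "default"

    def replace_rule(self, name: str, new_rule: Rule) -> None:
        """Targeted edit: swap exactly one rule, leaving the rest untouched."""
        for i, rule in enumerate(self.rules):
            if rule.name == name:
                self.rules[i] = new_rule
                return
        raise KeyError(name)


# Hypothetical rules and thresholds, not taken from the paper.
rubric = Rubric([
    Rule("high_vol_exit", "Sell when 20-day volatility exceeds 4%.",
         lambda f: f["volatility_20d"] > 0.04, "sell"),
    Rule("momentum_entry", "Buy when 5-day momentum is above 2%.",
         lambda f: f["momentum_5d"] > 0.02, "buy"),
])

decision = rubric.decide({"volatility_20d": 0.01, "momentum_5d": 0.03})
print(decision)  # ('buy', 'momentum_entry')
```

Because every decision carries the name of the rule that produced it, a poor outcome can be attributed to one rule and revised through `replace_rule`, rather than by rewriting a free-form prompt.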
