Trade-R1: Bridging Verifiable Rewards to Stochastic Environments via Process-Level Reasoning Verification
Rui Sun, Yifan Sun, Sheng Xu, Li Zhao, Jing Li, Daxin Jiang, Cheng Hua, Zuo Bai

TL;DR
This paper introduces Trade-R1, a framework that enhances reinforcement learning in stochastic financial environments by verifying reasoning processes, reducing reward hacking, and improving decision accuracy through structured reasoning verification and novel reward strategies.
Contribution
Trade-R1 presents a new process-level reasoning verification method and reward integration strategies to improve RL performance in noisy, stochastic financial markets.
Findings
Trade-R1 reduces reward hacking in RL tasks for financial decision-making.
Dynamically weighted Semantic Reward (DSR) improves cross-market generalization.
Structured reasoning verification enhances decision validity in stochastic environments.
Abstract
Reinforcement Learning (RL) has enabled Large Language Models (LLMs) to achieve remarkable reasoning in domains like mathematics and coding, where verifiable rewards provide clear signals. However, extending this paradigm to financial decision-making is challenged by the market's stochastic nature: rewards are verifiable but inherently noisy, causing standard RL to degenerate into reward hacking. To address this, we propose Trade-R1, a model training framework that bridges verifiable rewards to stochastic environments via process-level reasoning verification. Our key innovation is a verification method that transforms the problem of evaluating reasoning over lengthy financial documents into a structured Retrieval-Augmented Generation (RAG) task. We construct a triangular consistency metric, assessing pairwise alignment between retrieved evidence, reasoning chains, and decisions to serve as a…
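The pairwise structure of the triangular consistency metric can be illustrated with a minimal Python sketch. Everything below is an assumption for illustration only: the `Sample` type, the `token_overlap` scorer (a toy word-overlap measure standing in for whatever alignment judge the authors use), and the choice to average the three pairwise scores are not taken from the paper, whose abstract is truncated before it says how the metric is used.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets.

    Hypothetical stand-in for an alignment scorer; not the paper's method.
    """
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


@dataclass
class Sample:
    evidence: str   # text retrieved from the financial documents
    reasoning: str  # the model's reasoning chain
    decision: str   # the final decision, as text


def triangular_consistency(
    sample: Sample,
    align: Callable[[str, str], float] = token_overlap,
) -> Dict[str, float]:
    """Score the three edges of the evidence/reasoning/decision triangle."""
    pairwise = {
        "evidence_reasoning": align(sample.evidence, sample.reasoning),
        "reasoning_decision": align(sample.reasoning, sample.decision),
        "evidence_decision": align(sample.evidence, sample.decision),
    }
    # Assumption: aggregate with a plain mean; the source does not say
    # how the pairwise scores are combined.
    return {**pairwise, "aggregate": mean(pairwise.values())}


if __name__ == "__main__":
    example = Sample(
        evidence="quarterly revenue rose while debt fell",
        reasoning="revenue rose and debt fell so fundamentals improved",
        decision="buy because fundamentals improved",
    )
    for name, score in triangular_consistency(example).items():
        print(f"{name}: {score:.2f}")
```

Passing the scorer as the `align` parameter keeps the triangle logic separate from the alignment judgment, so a stronger scorer can be swapped in without changing the aggregation.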
Taxonomy
Topics: Stock Market Forecasting Methods · Explainable Artificial Intelligence (XAI) · Topic Modeling
