Decomposable Reward Modeling and Realistic Environment Design for Reinforcement Learning-Based Forex Trading
Nabeel Ahmad Saidd

TL;DR
This paper introduces a modular RL framework for Forex trading that emphasizes realistic environment modeling, decomposable rewards, and explicit trading actions, leading to improved training dynamics and performance.
Contribution
The paper presents a novel RL framework with a friction-aware environment, a decomposable reward architecture, and a discrete action interface tailored for Forex trading applications.
Findings
The full reward configuration achieves the highest Sharpe ratio and cumulative return.
An expanded action space increases returns but also raises turnover and lowers the Sharpe ratio.
Scaling variants reduce drawdown and improve endpoint performance.
Abstract
Applying reinforcement learning (RL) to foreign exchange (Forex) trading remains challenging because a realistic environment, a well-defined reward function, and an expressive action space are all required at once. Yet many prior studies rely on simplified simulators, single scalar rewards, and restricted action representations, limiting both interpretability and practical relevance. This paper presents a modular RL framework designed to address these limitations through three tightly integrated components: a friction-aware execution engine that enforces strict anti-lookahead semantics, with observations at time t, execution at time t+1, and mark-to-market at time t+1, while incorporating realistic costs such as spread, commission, slippage, rollover financing, and margin-triggered liquidation; a decomposable 11-component reward architecture with fixed weights and per-step…
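The timing rule in the abstract (observe at t, execute at t+1, mark to market at t+1) and the idea of a reward built from separately logged components with fixed weights can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Bar`, `step`, and `total_reward` are hypothetical, only three illustrative components are shown rather than the paper's eleven, and slippage, rollover financing, and margin liquidation are left out.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """Top-of-book quote for one time step (hypothetical structure)."""
    bid: float
    ask: float


def step(position, target, next_bar, prev_mid, commission_per_unit=0.0):
    """Apply a decision made on bar t using only prices from bar t+1.

    position: signed units held over t -> t+1, last marked at prev_mid
    target:   signed units chosen from the observation at time t
    Returns the new position, the new mid price, and the reward components.
    """
    mid = (next_bar.bid + next_bar.ask) / 2.0
    trade = target - position

    # Buys fill at the ask and sells at the bid, so each traded unit
    # loses the half-spread relative to the t+1 mid used for marking.
    fill = next_bar.ask if trade > 0 else next_bar.bid
    spread_cost = abs(trade) * abs(fill - mid)
    commission = abs(trade) * commission_per_unit

    # The position carried into t+1 earns the mid-price move from t to t+1.
    pnl = position * (mid - prev_mid)

    components = {
        "pnl": pnl,
        "spread": -spread_cost,
        "commission": -commission,
    }
    return target, mid, components


def total_reward(components, weights):
    """Fixed-weight sum; the per-component values stay available for logging."""
    return sum(weights[name] * value for name, value in components.items())


if __name__ == "__main__":
    weights = {"pnl": 1.0, "spread": 1.0, "commission": 1.0}  # assumed weights
    pos, mid, comps = step(0.0, 1000.0, Bar(1.1000, 1.1002), prev_mid=1.1000,
                           commission_per_unit=0.00002)
    print(comps, total_reward(comps, weights))
```

Because the fill price and the mark-to-market price both come from bar t+1, the decision taken at t cannot benefit from information it has not yet observed. Returning the components as a dictionary keeps each term inspectable before the weighted sum collapses them into a scalar.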
