Conditional Value-at-Risk for Quantitative Trading: A Direct Reinforcement Learning Approach
Ali Al-Ameer, Khaled Alshehri

TL;DR
This paper introduces a convex, online reinforcement learning method for trading that optimizes Conditional Value-at-Risk, demonstrating robustness and profitability in real market conditions with transaction costs.
Contribution
It presents a novel convex formulation for risk-adjusted trading policy learning using direct reinforcement learning, enabling online updates without multi-epoch training.
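To illustrate why convexity permits single-pass online updates, the following is a minimal sketch of one subgradient step on the standard Rockafellar-Uryasev CVaR objective, nu + max(loss - nu, 0) / (1 - alpha), which is jointly convex in the weights and the threshold nu when the loss is linear in the weights. It is not the paper's formulation: the linear loss `-w @ x`, the function name `cvar_online_step`, the step size `lr`, and the absence of transaction costs and position constraints are all assumptions made for illustration.

```python
import numpy as np

def cvar_online_step(w, nu, x, alpha=0.95, lr=0.01):
    """One subgradient step on nu + max(loss - nu, 0) / (1 - alpha).

    Hypothetical setup: x holds this period's per-asset returns and
    w the portfolio weights, so the period loss is -w @ x.
    """
    loss = -float(w @ x)
    if loss > nu:
        # Tail sample: the hinge term is active.
        grad_w = -x / (1.0 - alpha)
        grad_nu = 1.0 - 1.0 / (1.0 - alpha)
    else:
        # Non-tail sample: only the nu term contributes.
        grad_w = np.zeros_like(x)
        grad_nu = 1.0
    return w - lr * grad_w, nu - lr * grad_nu

# Each new observation triggers exactly one update; no epochs are needed.
w, nu = np.zeros(2), 0.0
for x in (np.array([1.0, -2.0]), np.array([1.0, -2.0])):
    w, nu = cvar_online_step(w, nu, x)
```

In practice the weights would also need to be projected onto a feasible set (for example, bounded positions) after each step, since this unconstrained sketch places no limit on `w`.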
Findings
Successfully applied to real market data over three years
Effectively detects market regime switches
Achieves profitable trading under risk constraints
Abstract
We propose a convex formulation for a trading system that uses the Conditional Value-at-Risk as a risk-adjusted performance measure within the Direct Reinforcement Learning framework. Because the formulation is convex, the proposed approach can uncover a lucrative trading policy in a "pure" online manner: it learns and updates the policy interactively, without multi-epoch training and validation. We assess the proposed algorithm on a real financial market, where it trades one of the largest US trust funds, SPDR, for three years. Numerical experiments demonstrate the algorithm's robustness in detecting central market-regime switching. Moreover, the results show the algorithm's effectiveness in extracting a profitable policy while meeting an investor's risk preference in a conservative frictional market with a transaction cost of 0.15% per trade.
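For readers unfamiliar with the two quantities the abstract relies on, the sketch below computes an empirical CVaR (the mean of the losses at or beyond the alpha-quantile) and per-period trading returns net of a proportional transaction cost of 0.15%. The function names, the tail-mean estimator, and the return convention (the previous position earns the current return, and cost is charged on the change in position, as is common in the direct reinforcement learning literature) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def empirical_cvar(losses, alpha=0.95):
    """Return (VaR, CVaR): the alpha-quantile of losses and the
    mean of the losses at or above it (simple tail-mean estimator)."""
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)
    tail = losses[losses >= var]
    return var, tail.mean()

def net_returns(positions, asset_returns, cost=0.0015):
    """Per-period returns net of proportional transaction costs.

    Assumed convention: positions[t] is chosen at time t, the position
    held during period t is positions[t-1] (flat before the first
    period), and cost is charged on |positions[t] - positions[t-1]|.
    """
    positions = np.asarray(positions, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    prev = np.concatenate(([0.0], positions[:-1]))
    gross = prev * asset_returns
    trades = np.abs(positions - prev)
    return gross - cost * trades

# Usage: losses are the negated net returns of a toy position sequence.
r = net_returns([1, 1, -1], [0.01, -0.02, 0.03])
var, cvar = empirical_cvar(-r, alpha=0.5)
```

By construction the CVaR is never smaller than the VaR at the same level, which is why it is regarded as the more conservative of the two tail-risk measures.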
Taxonomy
Topics: Financial Markets and Investment Strategies · Advanced Bandit Algorithms Research · Stock Market Forecasting Methods
