Risk-Aware Linear Bandits: Theory and Applications in Smart Order Routing
Jingwei Ji, Renyuan Xu, Ruihao Zhu

TL;DR
This paper introduces risk-aware algorithms for linear bandits in financial decision-making, with near-optimal regret bounds in theory and strong empirical performance on smart order routing tasks.
Contribution
The paper proposes two novel risk-aware algorithms, RISE and RISE++, with rigorous regret analysis and empirical validation on real financial datasets.
Findings
RISE and RISE++ outperform existing methods in regret minimization.
The linear structure is well supported by the NASDAQ ITCH dataset.
The algorithms perform especially well in complex decision scenarios.
Abstract
Motivated by practical considerations in machine learning for financial decision-making, such as risk aversion and large action spaces, we consider risk-aware bandit optimization with applications in smart order routing (SOR). Specifically, based on preliminary observations of linear price impacts made from the NASDAQ ITCH dataset, we initiate the study of risk-aware linear bandits. In this setting, we aim to minimize regret, which measures our performance deficit relative to the optimum, under the mean-variance metric when facing a set of actions whose rewards are linear functions of (initially) unknown parameters. Driven by the variance-minimizing globally-optimal (G-optimal) design, we propose the novel instance-independent Risk-Aware Explore-then-Commit (RISE) algorithm and the instance-dependent Risk-Aware Successive Elimination (RISE++) algorithm. Then, we rigorously analyze…
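To make the mean-variance regret mentioned in the abstract concrete, here is a minimal sketch, not the paper's implementation. It assumes, for illustration only, that both the mean and the variance of an action's reward are linear in the action vector, and that the mean-variance value is `rho * mean - variance`, which is one common convention. The names `theta`, `phi`, `rho`, `mean_variance`, and `mv_regret` are hypothetical, and the parameters are treated as known, whereas a bandit algorithm would have to learn them.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_variance(x, theta, phi, rho):
    """Mean-variance value of action x (assumed form: rho * mean - variance).

    Assumption: mean = <theta, x> and variance = <phi, x>.
    """
    return rho * dot(theta, x) - dot(phi, x)


def mv_regret(chosen, actions, theta, phi, rho):
    """Cumulative gap between the best action's value and the chosen actions' values."""
    best = max(mean_variance(x, theta, phi, rho) for x in actions)
    return sum(best - mean_variance(x, theta, phi, rho) for x in chosen)


# Toy instance (made-up numbers, purely illustrative).
actions = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
theta = (1.0, 0.5)     # hypothetical mean parameters
phi = (0.75, 0.125)    # hypothetical variance parameters
rho = 1.0              # hypothetical risk-tolerance weight

values = [mean_variance(x, theta, phi, rho) for x in actions]
regret = mv_regret([actions[0], actions[2], actions[1]], actions, theta, phi, rho)
print(values, regret)
```

In this toy instance the first action has the highest mean reward, but its variance is also the highest, so the second action is optimal under the mean-variance value. A risk-neutral learner that commits to the first action would keep accumulating regret under this metric.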
