Moira: Language-driven Hierarchical Reinforcement Learning for Pair Trading
Polydoros Giannouris, Yuechen Jiang, Lingfei Qian, Yuyan Wang, Xueqing Peng, Jimin Huang, Guojun Xiong, Sophia Ananiadou

TL;DR
This paper introduces Moira, a hierarchical reinforcement learning framework for pair trading in which large language models serve as both the high-level and low-level policies and are adapted through textual feedback rather than gradient updates. The authors report improved performance on real-world market data.
Contribution
The paper presents a novel language-driven hierarchical RL approach that leverages pretrained LLMs for both high-level and low-level policies without gradient fine-tuning.
Findings
Moira outperforms traditional baselines on real-world market data.
Language-driven adaptation improves decision quality under delayed feedback.
Explicit hierarchy separation reduces non-stationarity in trading strategies.
Abstract
Many sequential decision-making problems exhibit hierarchical structure, where high-level semantic choices constrain downstream actions and feedback is delayed and ambiguous. Learning in such settings is challenging due to credit assignment: performance degradation may arise from flawed abstractions, suboptimal execution, or their interaction. We study this challenge through pair trading, a domain that naturally combines long-horizon semantic reasoning for asset pair selection with short-horizon execution under partial observability. We formulate pair trading as a hierarchical reinforcement learning problem and propose a language-driven optimization framework in which both high-level and low-level policies are parameterized by large language models (LLMs) and optimized exclusively through prompt updates. Our approach leverages pretrained LLMs as hierarchical policies and uses…
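To make the idea of policies "optimized exclusively through prompt updates" concrete, here is a minimal sketch of a two-level loop. It is not the authors' implementation: `PromptPolicy`, `run_episode`, `train`, the stub functions standing in for LLM calls, the pair names, the spread values, and the feedback rule (tell the high-level policy to avoid a pair after a losing episode) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM call: maps a prompt string to a text reply.
LLM = Callable[[str], str]


@dataclass
class PromptPolicy:
    """A policy whose only adjustable 'parameter' is its prompt text."""
    llm: LLM
    prompt: str

    def act(self, observation: str) -> str:
        return self.llm(f"{self.prompt}\nObservation: {observation}")

    def update(self, feedback: str) -> None:
        # A prompt update takes the place of a gradient step.
        self.prompt = f"{self.prompt}\nLesson: {feedback}"


def stub_high(prompt: str) -> str:
    # Toy rule (assumption): pick AAA/BBB unless a lesson says to avoid it.
    return "CCC/DDD" if "avoid AAA/BBB" in prompt else "AAA/BBB"


def stub_low(prompt: str) -> str:
    # Toy mean-reversion rule (assumption): short a positive spread,
    # go long a negative one, stay flat at zero.
    spread = float(prompt.rsplit("spread=", 1)[1])
    if spread > 0:
        return "short"
    if spread < 0:
        return "long"
    return "flat"


def run_episode(high: PromptPolicy, low: PromptPolicy,
                spreads_by_pair: Dict[str, List[float]]) -> Tuple[str, float]:
    # High level: choose which pair to trade for the whole episode.
    pair = high.act(", ".join(sorted(spreads_by_pair)))
    pnl, position, prev = 0.0, 0, None
    # Low level: choose a position at every step for the chosen pair.
    for spread in spreads_by_pair[pair]:
        if prev is not None:
            pnl += position * (spread - prev)  # mark the held position to market
        action = low.act(f"pair={pair} spread={spread:+.2f}")
        position = {"long": 1, "short": -1, "flat": 0}[action]
        prev = spread
    return pair, pnl


def train(high: PromptPolicy, low: PromptPolicy,
          spreads_by_pair: Dict[str, List[float]],
          episodes: int) -> List[Tuple[str, float]]:
    results = []
    for _ in range(episodes):
        pair, pnl = run_episode(high, low, spreads_by_pair)
        results.append((pair, pnl))
        if pnl < 0:
            # Simplistic credit assignment (assumption): blame pair selection.
            high.update(f"avoid {pair}")
    return results


if __name__ == "__main__":
    data = {"AAA/BBB": [1.0, 2.0, 3.0], "CCC/DDD": [1.0, 0.5, 0.0]}
    high = PromptPolicy(stub_high, "Select one pair.")
    low = PromptPolicy(stub_low, "Trade the spread.")
    print(train(high, low, data, episodes=2))
```

In this toy setup the feedback only reaches the policies at the end of an episode, which mirrors the delayed-feedback setting the abstract describes. Always blaming the high-level choice is a deliberate oversimplification: deciding whether a loss came from pair selection, execution, or their interaction is exactly the credit-assignment problem the paper targets.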
