From Hypotheses to Factors: Constrained LLM Agents in Cryptocurrency Markets
Yikuan Huang, Zheqi Fan, Kaiqi Hu, Yifan Ye

TL;DR
This paper introduces a reproducible framework for using LLM agents to discover and test financial factors in cryptocurrency markets, emphasizing auditability and out-of-sample performance.
Contribution
It presents a structured protocol for hypothesis-driven factor discovery with LLM agents, ensuring reproducibility, auditability, and robust out-of-sample evaluation.
Findings
A ridge-combined portfolio trained only on 2020-2022 data achieved a 44.55% annualized return in the 2024-2026 pure out-of-sample period.
Over the same out-of-sample period, the portfolio had a Sharpe ratio of 1.55 after a 5 basis point one-way trading cost.
A deterministic engine enforces fixed data splits, selection gates, transaction costs, and portfolio tests, so that both successful and failed hypotheses stay auditable.
Abstract
LLM agents are promising tools for empirical discovery, but their flexibility can also turn discovery into uncontrolled search. We study how to use agents under a reproducible protocol through cryptocurrency factor discovery. Our framework casts the task as sequential hypothesis search: an agent reads an append-only experiment trace, proposes falsifiable factor hypotheses, and maps them to executable recipes, while a deterministic engine enforces fixed data splits, selection gates, transaction costs, and portfolio tests. Candidate actions are restricted to a point-in-time factor DSL, making both successful and failed hypotheses auditable. A ridge-combined portfolio trained only on 2020-2022 data achieves a 44.55% annualized return and Sharpe ratio of 1.55 in the 2024-2026 pure out-of-sample period after a 5 basis point one-way trading cost.
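
To make the evaluation step in the abstract concrete, the following is a minimal sketch of how a ridge-combined factor portfolio could be fit on a fixed training split and scored out of sample net of a one-way trading cost. It is not the authors' implementation. The synthetic data, the function names (`ridge_weights`, `net_returns`, `annualized_sharpe`), the ridge penalty, the dollar-neutral position rule, and the 365-periods-per-year annualization (daily bars in a 24/7 market) are all assumptions made for illustration; only the 5 basis point one-way cost and the train-then-test split come from the text.

```python
import numpy as np

COST_PER_SIDE = 5e-4  # 5 basis points one-way, as stated in the abstract


def ridge_weights(X, y, lam):
    """Closed-form ridge solution w = (X'X + lam * I)^{-1} X'y."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)


def net_returns(positions, asset_returns, cost=COST_PER_SIDE):
    """Per-period portfolio return net of one-way trading costs.

    positions[t] are the weights held over period t. Turnover is the sum of
    absolute weight changes, including the initial trade out of cash.
    """
    prev = np.vstack([np.zeros((1, positions.shape[1])), positions[:-1]])
    turnover = np.abs(positions - prev).sum(axis=1)
    gross = (positions * asset_returns).sum(axis=1)
    return gross - cost * turnover


def annualized_sharpe(returns, periods_per_year=365):
    """Annualized Sharpe ratio; 365 periods per year is an assumption."""
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)


# Synthetic stand-in data (hypothetical): T periods, N assets, K factors.
# factors[t] is assumed to be known before returns[t] is realized.
rng = np.random.default_rng(0)
T, N, K = 600, 20, 3
factors = rng.normal(size=(T, N, K))
true_beta = np.array([0.02, -0.01, 0.0])
returns = factors @ true_beta + rng.normal(scale=0.05, size=(T, N))

# Fixed split: fit the ridge combination on the training window only.
split = 400
w = ridge_weights(factors[:split].reshape(-1, K),
                  returns[:split].reshape(-1), lam=10.0)

# Out-of-sample: combined signal -> dollar-neutral, unit-gross positions.
signal = factors[split:] @ w
signal = signal - signal.mean(axis=1, keepdims=True)
positions = signal / np.abs(signal).sum(axis=1, keepdims=True)

oos = net_returns(positions, returns[split:])
sharpe = annualized_sharpe(oos)
print(f"out-of-sample net Sharpe (synthetic data): {sharpe:.2f}")
```

The combination weights are frozen before the test window begins and costs are charged on every weight change, so the out-of-sample figure cannot benefit from information or free rebalancing that a live strategy would not have. The printed number reflects the synthetic data only and has no bearing on the paper's reported results.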
