InvestAlign: Overcoming Data Scarcity in Aligning Large Language Models with Investor Decision-Making Processes under Herd Behavior

Huisheng Wang; Zhuoshi Pan; Hangjing Zhang; Mingxiao Liu; Hanqing Gao; H. Vicky Zhao

arXiv:2507.06528·cs.CL·July 10, 2025

TL;DR

InvestAlign introduces a novel framework that uses theoretical solutions to generate high-quality fine-tuning data, enabling LLMs to better mimic investor decision-making under herd behavior with less reliance on scarce real-user data.

Contribution

The paper presents InvestAlign, a new method for constructing effective training datasets using theoretical investment models, improving LLM alignment with investor behavior while reducing data collection costs.

Findings

01. According to the paper's theoretical analysis, training on InvestAlign-generated data yields faster parameter convergence than training on real-user data.

02. InvestAgent, an LLM agent fine-tuned with InvestAlign data, aligns more closely with real-user decisions.

03. The approach effectively handles both simple and complex investment scenarios.

Abstract

Aligning Large Language Models (LLMs) with investor decision-making processes under herd behavior is a critical challenge in behavioral finance, which grapples with a fundamental limitation: the scarcity of real-user data needed for Supervised Fine-Tuning (SFT). While SFT can bridge the gap between LLM outputs and human behavioral patterns, its reliance on massive authentic data imposes substantial collection costs and privacy risks. We propose InvestAlign, a novel framework that constructs high-quality SFT datasets by leveraging theoretical solutions to similar and simple optimal investment problems rather than complex scenarios. Our theoretical analysis demonstrates that training LLMs with InvestAlign-generated data achieves faster parameter convergence than using real-user data, suggesting superior learning efficiency. Furthermore, we develop InvestAgent, an LLM agent fine-tuned with…
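
To make the idea of building SFT data from a theoretical solution concrete, here is a minimal sketch in Python. It is not the paper's model, which the truncated abstract above does not specify. As a stand-in it assumes the classic Merton closed-form risky-asset share, (mu - r) / (gamma * sigma^2), plus a purely hypothetical herd adjustment that blends the investor's own optimum with a peer's allocation. All function names, parameters, and the prompt wording are illustrative assumptions.

```python
import json


def merton_fraction(mu, r, sigma, gamma):
    """Closed-form risky-asset share for the classic Merton problem (CRRA utility).

    Stand-in for a 'theoretical solution'; not the model used in the paper.
    """
    return (mu - r) / (gamma * sigma ** 2)


def herd_adjusted_fraction(own, peer, theta):
    """Hypothetical herd adjustment: convex blend of own optimum and a peer's share."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return (1.0 - theta) * own + theta * peer


def make_sft_record(mu, r, sigma, gamma, peer, theta):
    """Build one prompt/response pair whose label comes from the formula, not a user."""
    own = merton_fraction(mu, r, sigma, gamma)
    target = herd_adjusted_fraction(own, peer, theta)
    prompt = (
        f"Expected return {mu:.2%}, risk-free rate {r:.2%}, volatility {sigma:.2%}, "
        f"risk aversion {gamma}. A peer invests {peer:.0%} in the risky asset and "
        f"your herd coefficient is {theta}. What share do you invest?"
    )
    return {"prompt": prompt, "response": f"{target:.2f}"}


if __name__ == "__main__":
    record = make_sft_record(mu=0.08, r=0.02, sigma=0.2, gamma=3.0, peer=0.8, theta=0.4)
    print(json.dumps(record, indent=2))
```

Sweeping the parameters over a grid would produce as many labeled records as needed without collecting any real-user data, which is the cost and privacy advantage the abstract describes.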

Code & Models

Repositories

thu-social-network-research-group/InvestAlign (Official)

Taxonomy

Topics: FinTech, Crowdfunding, Digital Finance · Topic Modeling · Explainable Artificial Intelligence (XAI)

Methods: Shrink and Fine-Tune · ALIGN