DiffsFormer: A Diffusion Transformer on Stock Factor Augmentation
Yuan Gao, Haokun Chen, Xiang Wang, Zhicai Wang, Xue Wang, Jinyang Gao, Bolin Ding

TL;DR
This paper introduces DiffsFormer, a diffusion transformer model that generates artificial stock factors to augment training data, significantly improving forecasting performance on stock datasets by addressing data scarcity issues.
Contribution
We propose DiffsFormer, a novel diffusion transformer architecture that uses AI-generated samples to enhance stock forecasting models under data scarcity conditions.
Findings
Achieved improvements of 7.2% and 27.8% in annualized return ratio on the two datasets evaluated, respectively.
Demonstrated the effectiveness of AI-generated samples in mitigating data scarcity.
Provided insights into the components of DiffsFormer and their roles in performance enhancement.
Abstract
Machine learning models have demonstrated remarkable efficacy and efficiency in a wide range of stock forecasting tasks. However, the inherent challenges of data scarcity, including low signal-to-noise ratio (SNR) and data homogeneity, pose significant obstacles to accurate forecasting. To address this issue, we propose a novel approach that utilizes artificial intelligence-generated samples (AIGS) to enhance the training procedures. In our work, we introduce the Diffusion Model to generate stock factors with Transformer architecture (DiffsFormer). DiffsFormer is initially trained on a large-scale source domain, incorporating conditional guidance so as to capture global joint distribution. When presented with a specific downstream task, we employ DiffsFormer to augment the training procedure by editing existing samples. This editing step allows us to control the strength of the editing…
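The editing idea in the abstract can be illustrated with a generic DDPM-style sketch: an existing factor vector is noised forward to an intermediate step and then denoised back, so the chosen step acts as the editing strength. This is a minimal sketch under stated assumptions, not the authors' implementation. The excerpt does not specify the noise schedule, the sampler, or the conditioning, so the linear schedule, the ancestral sampling rule, and every name below (linear_beta_schedule, cumulative_alphas, noise_to_step, edit_sample, predict_noise, t_edit) are hypothetical. The trained Transformer denoiser is represented only by a caller-supplied predict_noise function.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_to_step(x0, t_edit, alpha_bars, rng):
    """Forward-diffuse x0 to step t_edit in closed form; t_edit=0 is a no-op."""
    if t_edit == 0:
        return list(x0)
    a = alpha_bars[t_edit - 1]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in x0]


def edit_sample(x0, t_edit, betas, predict_noise, rng):
    """Edit one factor vector: noise it to t_edit, then denoise back to step 0.

    predict_noise(x_t, t) stands in for a trained denoising network
    (hypothetical interface); a larger t_edit means a stronger edit.
    """
    alpha_bars = cumulative_alphas(betas)
    x = noise_to_step(x0, t_edit, alpha_bars, rng)
    for t in range(t_edit, 0, -1):
        beta, a_bar = betas[t - 1], alpha_bars[t - 1]
        eps = predict_noise(x, t)
        coef = beta / math.sqrt(1.0 - a_bar)
        # Posterior mean of the standard DDPM reverse step.
        mean = [(xi - coef * ei) / math.sqrt(1.0 - beta)
                for xi, ei in zip(x, eps)]
        # No fresh noise is added on the final step.
        sigma = math.sqrt(beta) if t > 1 else 0.0
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x
```

In this sketch, t_edit = 0 returns the sample unchanged, while t_edit equal to the full schedule length approaches generation from pure noise. A call such as edit_sample(factors, 20, linear_beta_schedule(50), predict_noise, random.Random(0)) would produce one augmented copy of factors, given some trained predict_noise.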
Taxonomy
Topics: Financial Markets and Investment Strategies · Stock Market Forecasting Methods
Methods: Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Diffusion · Absolute Position Encodings · Softmax · Byte Pair Encoding · Linear Layer · Attention Is All You Need · Dropout
