The Limits of Complexity: Why Feature Engineering Beats Deep Learning in Investor Flow Prediction
Sungwoo Kang

TL;DR
This study finds that simple, domain-informed feature engineering can outperform complex deep learning models in financial prediction, particularly where the signal-to-noise ratio is low.
Contribution
The paper provides empirical evidence that traditional feature engineering can surpass advanced deep learning techniques in investor flow prediction tasks.
Findings
Simple linear models with domain-specific features outperform complex models in predictive accuracy.
Deep learning models such as LSTM networks with attention mechanisms underperform on noisy financial data.
Complex signal extraction methods do not significantly improve prediction over basic feature engineering.
Abstract
The application of machine learning to financial prediction has accelerated dramatically, yet the conditions under which complex models outperform simple alternatives remain poorly understood. This paper investigates whether advanced signal processing and deep learning techniques can extract predictive value from investor order flows beyond what simple feature engineering achieves. Using a comprehensive dataset of 2.79 million observations spanning 2,439 Korean equities from 2020 to 2024, we apply three methodologies: Independent Component Analysis (ICA) to recover latent market drivers, Wavelet Coherence analysis to characterize multi-scale correlation structure, and Long Short-Term Memory (LSTM) networks with attention mechanisms for non-linear prediction. Our results reveal a striking finding: a parsimonious linear model using market capitalization-normalized…
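The abstract is cut off before it names the exact feature, so the following is only a minimal sketch of what a market-capitalization-normalized flow feature feeding a linear model might look like. The function names (`normalize_flow`, `fit_ols`), the synthetic data, and the choice of a one-variable least-squares fit are all assumptions made for illustration; they are not taken from the paper.

```python
import random


def normalize_flow(net_flow, market_cap):
    """Divide raw net investor flow by market capitalization.

    Assumed feature definition: both inputs are in the same currency
    units, so the result is a unitless fraction of firm size.
    """
    return [f / m for f, m in zip(net_flow, market_cap)]


def fit_ols(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic data only: firm sizes spanning several orders of magnitude,
# with a weak signal buried in much larger noise.
rng = random.Random(0)
n_obs = 500
market_cap = [10 ** rng.uniform(10, 13) for _ in range(n_obs)]
flow_fraction = [rng.gauss(0, 0.001) for _ in range(n_obs)]
net_flow = [s * m for s, m in zip(flow_fraction, market_cap)]
next_return = [2.0 * s + rng.gauss(0, 0.01) for s in flow_fraction]

# Normalizing removes the scale effect of firm size from the raw flow.
feature = normalize_flow(net_flow, market_cap)
intercept, slope = fit_ols(feature, next_return)
print(f"intercept={intercept:.5f} slope={slope:.3f}")
```

In this toy setup the raw flow is dominated by firm size, while the normalized feature is comparable across large and small firms, which is the kind of domain-informed transformation the findings refer to.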
Taxonomy
Topics: Stock Market Forecasting Methods · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis
