Generating long-horizon stock "buy" signals with a neural language model
Joel R. Bock

TL;DR
This study fine-tunes a small language model to generate long-horizon stock buy signals from the narrative text of 10-K reports. Out of sample, the buy signals are most precise at 6 and 9 months, with implications for sector-specific reporting styles.
Contribution
It introduces a method for forecasting long-horizon stock price direction from the narrative text of 10-K reports, producing buy signals that outperform a random stock selection model.
Findings
Buy signals are most precise at 6 and 9 months.
An F1-macro score of 0.62 indicates good out-of-sample performance at medium to long investment horizons.
Measured by F1 score, buy signals improve on a random stock selection model by 4.8 to 9 percent.
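The two metrics behind these findings, macro-averaged F1 and improvement over a random selection baseline, can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function names, the toy labels, and the choice to express improvement as a relative percentage (the summary does not say whether the reported 4.8 to 9 percent is relative or absolute). It is not the paper's evaluation code.

```python
import random

def f1_per_class(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def f1_macro(y_true, y_pred, labels=("buy", "sell")):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_per_class(y_true, y_pred, lab) for lab in labels) / len(labels)

def random_baseline(n, seed=0, labels=("buy", "sell")):
    """Hypothetical random stock selection: a uniform random label per stock."""
    rng = random.Random(seed)
    return [rng.choice(labels) for _ in range(n)]

def relative_improvement(model_score, baseline_score):
    """Percent improvement of the model over the baseline (assumed relative)."""
    if baseline_score == 0:
        raise ValueError("baseline score must be non-zero")
    return 100.0 * (model_score - baseline_score) / baseline_score

# Toy labels, invented for illustration only.
y_true = ["buy", "buy", "sell", "buy", "sell", "sell", "buy", "buy"]
y_pred = ["buy", "buy", "buy", "buy", "sell", "buy", "sell", "buy"]
y_rand = random_baseline(len(y_true))

print("model F1 (buy): ", f1_per_class(y_true, y_pred, "buy"))
print("model F1-macro: ", f1_macro(y_true, y_pred))
print("random F1 (buy):", f1_per_class(y_true, y_rand, "buy"))
```

Scoring the buy class on its own, as `f1_per_class` does, separates buy-signal quality from sell-signal quality, while `f1_macro` weights both classes equally regardless of how often each occurs.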
Abstract
This paper describes experiments on fine-tuning a small language model to generate forecasts of long-horizon stock price movements. Inputs to the model are narrative text from 10-K reports of large market capitalization companies in the S&P 500 index; the output is a forward-looking buy or sell decision. Price direction is predicted at discrete horizons up to 12 months after the report filing date. The results reported here demonstrate good out-of-sample statistical performance (F1-macro = 0.62) at medium to long investment horizons. In particular, the buy signals generated from 10-K text are found to be most precise at horizons of 6 and 9 months. As measured by the F1 score, the buy signal provides between 4.8 and 9 percent improvement over a random stock selection model. In contrast, sell signals generated by the models do not perform well. This may be attributed to the highly…
Taxonomy
Topics: Stock Market Forecasting Methods
