News Sentiment Embeddings for Stock Price Forecasting
Ayaan Qayyum

TL;DR
This study demonstrates that using news headline embeddings significantly improves stock price forecasting accuracy for the SPY ETF, leveraging OpenAI models, PCA, and economic indicators.
Contribution
It introduces a novel approach combining news embeddings, PCA, and economic data to enhance stock prediction models, achieving substantial accuracy gains.
Findings
Headline embeddings improve prediction accuracy by at least 40% relative to models trained without them.
Over 390 machine learning models were trained and evaluated.
News data captures nuanced market impacts effectively.
Abstract
This paper discusses how headline data can be used to predict stock prices. The stock in question is the SPDR S&P 500 ETF Trust (SPY), which tracks the performance of the largest 500 publicly traded corporations in the United States. A key focus is using news headlines from the Wall Street Journal (WSJ) to predict the movement of stock prices on a daily timescale. OpenAI-based text embedding models create a vector encoding of each headline, and principal component analysis (PCA) is used to extract the key features. The challenge of this work is to capture the time-dependent and time-independent, nuanced impacts of news on stock prices while handling potential lag effects and market noise. Financial and economic data were collected to improve model performance; such sources include the U.S. Dollar Index (DXY) and Treasury Interest Yields. Over 390 machine-learning…
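The pipeline described in the abstract (headline embeddings, PCA, economic indicators, a daily prediction target) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the data are random placeholders standing in for per-day averaged headline embeddings, DXY and Treasury yield values, and SPY closing prices; the helper names `pca_reduce` and `build_lagged_features`, the choice of 8 components, the one-day lag, and the plain least-squares regressor are all hypothetical.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def build_lagged_features(daily_embeddings, econ, n_components=8, lag=1):
    """Build features for day t that will be paired with the target on day t + lag."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    reduced = pca_reduce(daily_embeddings, n_components)
    features = np.hstack([reduced, econ])
    # Drop the last `lag` days: they have no future target to predict.
    return features[:-lag]


# Placeholder data (assumption): real inputs would be headline embeddings
# averaged per trading day, economic indicators, and SPY closing prices.
rng = np.random.default_rng(0)
n_days, embed_dim = 250, 64
daily_embeddings = rng.normal(size=(n_days, embed_dim))
econ = rng.normal(size=(n_days, 2))  # stand-ins for DXY and a Treasury yield
close = 400 + np.cumsum(rng.normal(size=n_days))  # stand-in for SPY close

lag = 1
X = build_lagged_features(daily_embeddings, econ, n_components=8, lag=lag)
y = close[lag:]  # target: the close `lag` days after the features

# Ordinary least squares with an intercept column (illustrative model only).
A = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
```

In a real evaluation the PCA projection and the regression would be fitted on a training window only and then applied to later dates; fitting on the full series, as this sketch does for brevity, leaks future information into the features.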
