Using Machine Learning and Alternative Data to Predict Movements in Market Risk
Thomas Dierckx, Jesse Davis, Wim Schoutens

TL;DR
This paper explores the use of machine learning and alternative data sources, namely Google News statistics and Wikipedia site traffic, to predict movements in market implied volatility, a forward-looking measure that is central to derivatives markets. It also reports preliminary evidence of non-linear relationships in the data.
Contribution
The authors report finding no prior research on predicting market implied volatility with alternative data and machine learning. The paper addresses that gap and highlights possible non-linear relationships between the features and volatility.
Findings
Market implied volatility can be predicted with machine learning.
Alternative data did not significantly improve accuracy.
There is preliminary evidence of non-linear relationships between Wikipedia traffic and volatility.
Abstract
Using machine learning and alternative data for the prediction of financial markets has been a popular topic in recent years. Many financial variables such as stock price, historical volatility and trade volume have already been through extensive investigation. Remarkably, we found no existing research on the prediction of an asset's market implied volatility within this context. This forward-looking measure gauges the sentiment on the future volatility of an asset, and is deemed one of the most important parameters in the world of derivatives. The ability to predict this statistic may therefore provide a competitive edge to practitioners of market making and asset management alike. Consequently, in this paper we investigate Google News statistics and Wikipedia site traffic as alternative data sources to quantitative market data and consider Logistic Regression, Support Vector Machines…
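To make the setup described in the abstract more concrete, the following is a minimal sketch of a logistic regression classifier that predicts whether implied volatility moves up or down. It is not the authors' implementation. All data are synthetic, and the feature columns (previous change in implied volatility, change in news article count, change in Wikipedia page views), the weights in `true_w`, and every function name are assumptions introduced purely for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_data(n, rng):
    """Generate hypothetical standardized features and up/down labels.

    Columns (all synthetic): previous change in implied volatility,
    change in news article count, change in Wikipedia page views.
    """
    true_w = [1.5, 0.8, 0.5]  # assumed weights, chosen only for illustration
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in true_w]
        p_up = sigmoid(sum(w * x for w, x in zip(true_w, row)))
        X.append(row)
        y.append(1 if rng.random() < p_up else 0)
    return X, y


def fit_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - target
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, row):
    """Return 1 (volatility up) or 0 (volatility down)."""
    score = sum(wi * xi for wi, xi in zip(w, row)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


def accuracy(w, b, X, y):
    """Fraction of rows whose predicted direction matches the label."""
    return sum(predict(w, b, r) == t for r, t in zip(X, y)) / len(y)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
X, y = make_synthetic_data(600, rng)
# Keep row order when splitting, as one would with time-ordered data:
# train on the first 400 rows, evaluate on the remaining 200.
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]
w, b = fit_logistic_regression(X_train, y_train)
test_acc = accuracy(w, b, X_test, y_test)
print(f"test accuracy: {test_acc:.3f}")
```

With real data, the synthetic generator would be replaced by features computed from market quotes, news counts, and page-view logs, and the out-of-sample accuracy would be compared against a baseline that uses market data alone.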
Taxonomy
Topics: Stock Market Forecasting Methods · Financial Markets and Investment Strategies · Advanced Text Analysis Techniques
Methods: Logistic Regression
