Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters
Vitor Cerqueira, Luis Torgo, Carlos Soares

TL;DR
This paper shows that machine learning methods gain ground on, and with enough data outperform, traditional statistical methods in time series forecasting as the sample size grows, countering an earlier claim that ML systematically underperforms in this domain.
Contribution
The study shows that the relative performance of machine learning methods improves with larger sample sizes, highlighting the importance of data quantity in forecasting accuracy.
Findings
ML methods outperform statistical methods with larger samples
Performance gap widens as sample size increases
Earlier results favoring statistical methods hold only at very small sample sizes
Abstract
Time series forecasting is one of the most active research topics. Machine learning methods have been increasingly adopted to solve these predictive tasks. However, in a recent work, these were shown to systematically present a lower predictive performance relative to simple statistical methods. In this work, we counter these results. We show that these are only valid under an extremely low sample size. Using a learning curve method, our results suggest that machine learning methods improve their relative predictive performance as the sample size grows. The code to reproduce the experiments is available at https://github.com/vcerqueira/MLforForecasting.
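The learning curve idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' experimental code (that lives in the repository linked above) and not necessarily their evaluation procedure. It assumes a synthetic series, a k-nearest-neighbours regressor on lagged values as a stand-in for a machine learning method, and a naive last-value forecast as a stand-in for a simple statistical baseline. All function names and parameter values are hypothetical, chosen only for illustration.

```python
import math
import random


def make_series(n, seed=0):
    """Synthetic series (an assumption): two sinusoids plus Gaussian noise."""
    rng = random.Random(seed)
    return [
        math.sin(2 * math.pi * t / 12)
        + 0.5 * math.sin(2 * math.pi * t / 5)
        + rng.gauss(0, 0.1)
        for t in range(n)
    ]


def embed(series, k):
    """Turn a series into (lags, target) pairs using the k previous values."""
    return [(series[i - k:i], series[i]) for i in range(k, len(series))]


def knn_forecast(train_pairs, lags, n_neighbors=3):
    """Average the targets of the training windows closest to `lags`."""
    nearest = sorted(
        train_pairs,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], lags)),
    )[:n_neighbors]
    return sum(target for _, target in nearest) / len(nearest)


def naive_forecast(lags):
    """Baseline: predict that the next value equals the last observed one."""
    return lags[-1]


def mean_abs_error(pairs, predict):
    """Mean absolute one-step-ahead error of `predict` over (lags, target) pairs."""
    return sum(abs(predict(lags) - target) for lags, target in pairs) / len(pairs)


def learning_curve(series, train_sizes, test_size=50, k=5):
    """Score both methods on a fixed hold-out while the training window grows."""
    test_start = len(series) - test_size
    # Keep the k values before the hold-out so every test point has its lags.
    test_pairs = embed(series[test_start - k:], k)
    rows = []
    for n in train_sizes:
        # Use the n observations immediately preceding the hold-out.
        train_pairs = embed(series[test_start - n:test_start], k)
        knn_err = mean_abs_error(
            test_pairs, lambda lags: knn_forecast(train_pairs, lags)
        )
        naive_err = mean_abs_error(test_pairs, naive_forecast)
        rows.append((n, knn_err, naive_err))
    return rows


curve = learning_curve(make_series(400), train_sizes=[20, 80, 320])
for n, knn_err, naive_err in curve:
    print(f"train size {n:4d}  kNN MAE {knn_err:.3f}  naive MAE {naive_err:.3f}")
```

Each row reports the error of both methods for one training size. The naive baseline does not learn from the training window, so its error stays constant, while the learned model's error can change as more data becomes available. Comparing the two columns across rows is the kind of relative-performance reading a learning curve supports.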
Taxonomy
Topics: Time Series Analysis and Forecasting · Forecasting Techniques and Applications · Stock Market Forecasting Methods
