Generalisation in fully-connected neural networks for time series forecasting
Anastasia Borovykh, Cornelis W. Oosterlee, Sander M. Bohte

TL;DR
This paper investigates how fully-connected neural networks generalize in time series forecasting, using Hessian-based metrics to quantify and control generalization ability through training hyperparameters.
Contribution
It introduces Hessian-based measures for assessing generalization in time series neural networks and demonstrates how training hyperparameters influence model complexity and generalization.
Findings
Input and weight Hessian metrics quantify a network's ability to generalize in time series forecasting.
Training hyperparameters such as the learning rate and batch size control model complexity.
The Hessian-based generalization measures are validated empirically in non-i.i.d. data settings.
Abstract
In this paper we study the generalization capabilities of fully-connected neural networks trained in the context of time series forecasting. Time series do not satisfy the typical assumption in statistical learning theory of the data being i.i.d. samples from some data-generating distribution. We use the input and weight Hessians, that is, the smoothness of the learned function with respect to the input and the width of the minimum in weight space, to quantify a network's ability to generalize to unseen data. While such generalization metrics have been studied extensively in the i.i.d. setting of, for example, image recognition, here we empirically validate their use in the task of time series forecasting. Furthermore, we discuss how one can control the generalization capability of the network by means of the training process using the learning rate, batch size and the number of training…
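To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy sine series, a tiny one-hidden-layer tanh network, and plain full-batch gradient descent. Derivatives are approximated by central finite differences rather than automatic differentiation. All function names (`make_windows`, `predict`, `mse`, `grad`, `hessian`) are hypothetical, and the trace and Frobenius norm are used here only as illustrative summaries of each Hessian. They are not necessarily the metrics used in the paper.

```python
import math
import random

def make_windows(series, lag):
    """Turn a series into (lagged inputs, next value) pairs."""
    return [(series[t - lag:t], series[t]) for t in range(lag, len(series))]

def predict(w, x, hidden):
    """One-hidden-layer tanh network without biases; w is a flat weight list."""
    lag = len(x)
    out = 0.0
    for j in range(hidden):
        pre = sum(w[j * lag + i] * x[i] for i in range(lag))
        out += w[hidden * lag + j] * math.tanh(pre)
    return out

def mse(w, data, hidden):
    """Mean squared one-step-ahead forecast error."""
    total = 0.0
    for x, y in data:
        d = predict(w, x, hidden) - y
        total += d * d
    return total / len(data)

def grad(f, p, eps=1e-5):
    """Central finite-difference gradient of scalar f at point p."""
    g = []
    for i in range(len(p)):
        hi = list(p)
        lo = list(p)
        hi[i] += eps
        lo[i] -= eps
        g.append((f(hi) - f(lo)) / (2 * eps))
    return g

def hessian(f, p, eps=1e-3):
    """Central finite-difference Hessian of scalar f at point p."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            vals = []
            for si, sj in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
                q = list(p)
                q[i] += si * eps
                q[j] += sj * eps
                vals.append(f(q))
            h = (vals[0] - vals[1] - vals[2] + vals[3]) / (4 * eps * eps)
            H[i][j] = H[j][i] = h
    return H

def trace(H):
    return sum(H[i][i] for i in range(len(H)))

def frobenius(H):
    return math.sqrt(sum(v * v for row in H for v in row))

# Assumed toy setup: sine series, 3 lags, 2 hidden units.
random.seed(0)
LAG, HIDDEN = 3, 2
series = [math.sin(0.3 * t) for t in range(60)]
data = make_windows(series, LAG)
w = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN * LAG + HIDDEN)]

def loss(p):
    return mse(p, data, HIDDEN)

# A short run of full-batch gradient descent (assumed learning rate 0.05).
for _ in range(200):
    w = [wi - 0.05 * gi for wi, gi in zip(w, grad(loss, w))]

# Weight Hessian: curvature of the training loss around the found weights.
weight_H = hessian(loss, w)
# Input Hessian: curvature of the network output around one input window.
x0 = data[0][0]
input_H = hessian(lambda x: predict(w, x, HIDDEN), x0)

print("weight Hessian trace:", trace(weight_H))
print("input Hessian Frobenius norm:", frobenius(input_H))
```

Under these assumptions, a smaller weight-Hessian trace corresponds to a wider minimum, and a smaller input-Hessian norm corresponds to a smoother learned function. Rerunning the sketch with a different learning rate or number of steps is one way to see how the training process shifts both quantities.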
Taxonomy
Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference · Time Series Analysis and Forecasting
