PaEBack: Pareto-Efficient Backsubsampling for Time Series Data

Xinyu Zhang; Sujit Ghosh

arXiv:2210.15780·stat.AP·October 2, 2023

Open Access

TL;DR

This paper introduces PaEBack, a method for estimating the smallest share of the most recent data needed to reach a targeted level of forecasting accuracy relative to the full time series, supported by theoretical and numerical evidence.

Contribution

The paper proposes PaEBack, a novel approach for estimating the fraction of the most recent data required for effective time series prediction, with a theoretical justification based on asymptotic prediction theory for AR models.

Findings

01. A small recent data subset can achieve near-optimal prediction accuracy.

02. PaEBack applies effectively even with model misspecification.

03. The method is supported by theoretical and numerical validation.

Abstract

Time series forecasting has long been a quintessential topic in data science, and forecasting models have traditionally relied on extensive historical data. In this paper, we address a practical question: how much recent historical data is required to attain a targeted percentage of statistical prediction efficiency compared to the full time series? We propose the Pareto-Efficient Backsubsampling (PaEBack) method to estimate the percentage of the most recent data needed to achieve the desired level of prediction accuracy. We provide a theoretical justification based on asymptotic prediction theory for AutoRegressive (AR) models. Through several numerical illustrations, we also show the application of PaEBack to some recently developed machine learning forecasting methods, even when the models might be misspecified. The main conclusion is that only a fraction of the most…
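The question the abstract poses can be illustrated with a small numerical sketch. The code below is not the authors' procedure: it is a minimal illustration that assumes an AR(p) model fitted by ordinary least squares, measures efficiency as the ratio of the full-data one-step-ahead holdout MSE to the subsample MSE, and scans a grid of fractions from smallest to largest. The function names (`simulate_ar1`, `fit_ar`, `one_step_mse`, `backsubsample_fraction`), the holdout split, and the efficiency definition are all assumptions made for illustration.

```python
import numpy as np


def simulate_ar1(n, phi=0.6, seed=0):
    """Simulate an AR(1) series with standard normal innovations (toy data)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x


def _design(x, p):
    """Build rows [1, x[t-1], ..., x[t-p]] for t = p, ..., len(x) - 1."""
    n = len(x)
    lags = [x[p - k: n - k] for k in range(1, p + 1)]
    return np.column_stack([np.ones(n - p)] + lags)


def fit_ar(x, p):
    """Least-squares AR(p) fit with intercept; returns the coefficient vector."""
    coef, *_ = np.linalg.lstsq(_design(x, p), x[p:], rcond=None)
    return coef


def one_step_mse(coef, x, p):
    """Mean squared one-step-ahead prediction error of `coef` on series `x`."""
    return float(np.mean((x[p:] - _design(x, p) @ coef) ** 2))


def backsubsample_fraction(train, test, p=1, target=0.95, fractions=None):
    """Smallest grid fraction of the most recent training data whose holdout
    efficiency (full-data MSE / subsample MSE, an assumed definition)
    reaches `target`."""
    if fractions is None:
        fractions = np.linspace(0.05, 1.0, 20)
    full_mse = one_step_mse(fit_ar(train, p), test, p)
    for frac in fractions:
        # Keep only the most recent m observations, with a floor so the
        # regression stays well-posed.
        m = min(len(train), max(int(np.ceil(frac * len(train))), 2 * p + 2))
        sub_mse = one_step_mse(fit_ar(train[-m:], p), test, p)
        efficiency = full_mse / sub_mse
        if efficiency >= target:
            return float(frac), efficiency
    return 1.0, 1.0


if __name__ == "__main__":
    series = simulate_ar1(2000)
    train, test = series[:1600], series[1600:]
    frac, eff = backsubsample_fraction(train, test, p=1, target=0.95)
    print(f"fraction of recent data: {frac:.2f}, efficiency: {eff:.3f}")
```

Because the grid ends at 1.0, where the subsample fit coincides with the full-data fit, the scan always returns a fraction for any target at or below 1. The paper's own estimator and its asymptotic justification may differ from this holdout-based scan.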

Taxonomy

Topics: Machine Learning and Data Classification · Neural Networks and Applications · Stock Market Forecasting Methods