TimeSynth: A Framework for Uncovering Systematic Biases in Time Series Forecasting
Md Rakibul Haque, Vishwa Goudar, Shireen Elhabian, Warren Woodrich Pettine

TL;DR
TimeSynth is a framework that systematically evaluates time series forecasting models, revealing biases in linear models and advantages of nonlinear models like CNNs and Transformers, especially with complex signals.
Contribution
It introduces a structured, realistic benchmarking framework that isolates model performance factors and challenges prior claims of linear model dominance in time series forecasting.
Findings
Linear models tend to collapse to simple oscillations regardless of complexity.
Nonlinear models, especially CNNs and Transformers, outperform linear models on complex signals.
The framework assesses robustness under distribution and noise shifts.
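A noise shift of this kind can be sketched as corrupting a clean signal at a fixed signal-to-noise ratio. The snippet below is a minimal illustration, not the authors' code: it assumes additive white Gaussian noise, and the helper names `add_noise_at_snr` and `measured_snr_db` are hypothetical. The 40, 30, and 20 dB levels are the ones quoted in the peer reviews further down.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, seed=0):
    """Return `signal` plus white Gaussian noise scaled to a target SNR (dB).

    Assumption: the noise model is additive white Gaussian noise; the page
    does not state which noise model the paper actually uses.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR (dB) of `noisy` relative to `clean`."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Illustrative clean signal: a plain sinusoid (frequency chosen arbitrarily).
clean = [math.sin(2 * math.pi * 1.2 * t / 64) for t in range(4096)]
for snr in (40, 30, 20):
    noisy = add_noise_at_snr(clean, snr)
    print(snr, round(measured_snr_db(clean, noisy), 1))
```

The measured SNR lands close to the target because the noise variance is set directly from the signal power; the small residual gap comes from the finite sample.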
Abstract
Time series forecasting is a fundamental tool with wide-ranging applications, yet recent debates question whether complex nonlinear architectures truly outperform simple linear models. Prior claims of dominance of the linear model often stem from benchmarks that lack diverse temporal dynamics and employ biased evaluation protocols. We revisit this debate through TimeSynth, a structured framework that emulates key properties of real-world time series, including non-stationarity, periodicity, trends, and phase modulation, by creating synthesized signals whose parameters are derived from real-world time series. Evaluating four model families (Linear, Multi-Layer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Transformers), we find a systematic bias in linear models: they collapse to simple oscillation regardless of signal complexity. Nonlinear models avoid this collapse and gain…
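To make the signal properties named in the abstract concrete, the sketch below builds one sinusoid that combines a linear trend, a linear frequency drift (a simple form of non-stationarity), and sinusoidal phase modulation. It is only an illustration under stated assumptions: the function `synth_signal`, its parameters, and the functional form are hypothetical and are not taken from the paper, which derives its parameters from real recordings.

```python
import math


def synth_signal(n, fs, f0, drift=0.0, trend=0.0, pm_depth=0.0, pm_freq=0.0):
    """Hypothetical synthetic signal: trend + drifting, phase-modulated sinusoid.

    n        number of samples
    fs       sampling rate in Hz
    f0       base frequency in Hz
    drift    linear change of instantaneous frequency, in Hz per second
    trend    slope of the additive linear trend, in units per second
    pm_depth phase-modulation depth in radians
    pm_freq  phase-modulation frequency in Hz
    """
    out = []
    for i in range(n):
        t = i / fs
        # Carrier phase is the integral of the instantaneous frequency f0 + drift * t.
        phase = 2 * math.pi * (f0 * t + 0.5 * drift * t * t)
        # Sinusoidal phase modulation on top of the carrier.
        phase += pm_depth * math.sin(2 * math.pi * pm_freq * t)
        out.append(trend * t + math.sin(phase))
    return out


# Example values are arbitrary and chosen only for illustration.
x = synth_signal(n=512, fs=64.0, f0=1.2, drift=0.05, trend=0.01,
                 pm_depth=0.8, pm_freq=0.2)
print(len(x), x[0])
```

Setting `drift`, `trend`, and `pm_depth` to zero recovers a plain sinusoid, which makes it easy to switch each property on in isolation.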
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
- The work directly engages with a critical, unresolved debate in the field. By shifting the focus from which model is "best" to why models fail, it provides a much-needed mechanistic explanation for performance discrepancies.
- The TimeSynth framework uses parameters derived from real-world physiological signals (PPG, ECG) rather than arbitrary ones. Its design of three signal families (Drift, SPM, DPM) allows for a controlled assessment of model capabilities against specific dynamic propertie
- The TimeSynth framework is parameterized exclusively from BVP (PPG) and ECG signals. Both are physiological time series known for their strong, quasi-periodic, harmonic nature. The paper's framing and conclusions, however, speak to "real-world time series" in general, which include domains like finance, energy, and climate that are dominated by very different dynamics (e.g., stochastic trends, chaotic behavior, sharp, non-periodic shocks). The current framework, being entirely harmonic-based,
The three signal families isolate non-stationarity, modulation, and multi-component structure with parameters sourced from real datasets, enabling targeted stress tests rather than ad-hoc toy signals. Adding frequency and phase fidelity provides a more faithful view of oscillatory forecasting than amplitude-only scores; plots (horizon-wise) help interpret failure modes. Clean vs. noise (SNR 40/30/20 dB) vs. distribution shift (five disjoint frequency ranges) show when models truly generalize vs.
1. The framework restricts to univariate forecasting; many real applications hinge on multivariate dependencies (cross-channel/lag interactions). Results and claims may not transfer.
2. “Bounded neural fitting” for deriving parameter distributions (PPG-Dalia, MIT-BIH) lacks details (objective, bounds, priors, initialization, regularization, failure handling), making replication and bias assessment difficult.
3. The linear family’s “collapse” may partly reflect optimization/regularization/normali
**S1** The paper introduces a synthetic benchmark (TimeSynth) whose parameterization is carefully grounded in fits to real-world physiological time series, providing a rigorous and controlled testbed for forecasting method evaluation. The drift, phase modulation, and dual-modulation primitives are well-motivated and visually illustrated in Figure 1 and Appendix figures (e.g., Figures 9–11).
**S2** Evaluation is more holistic than usual, spanning amplitude (MAE/MSE), frequency alignment, and pha
**W1** Limited Novelty of Core Idea: While the framework is more principled than prior synthetic benchmarks, the design is essentially an aggregator of improved synthetic signal families and systematic model evaluation. The creation of synthetic time series for benchmarking and the classic “linear vs. nonlinear” debate are both very well-trodden. The novelty is more in the comprehensive execution and the systematic bias claim, but less in core methodological innovation.
**W2** Framework Coverag
Principled synthetic design: Parameters are fit from real data rather than hand-picked; families capture trend, drift, single/dual modulation, and controlled frequency shifts.
Richer metrics: Frequency and phase errors (plus amplitude) provide a more faithful assessment of oscillatory fidelity than MAE/MSE alone.
Clear empirical finding: Well-documented linear collapse across families and conditions; nonlinear models show superior phase and frequency tracking.
Statistical rigor: Mixed-effects
Domain narrowness of parameter derivation. Real-parameter fitting is taken from physiological signals (PPG-DaLiA, MIT-BIH ECG). Claims about general time-series forecasting would be stronger if parameters were also derived from non-physio domains (load, economics, climate).
Univariate only. TimeSynth is univariate; many real settings are multivariate with cross-lag structure. Conclusions may not carry over.
Evaluation choices need tightening for DPM. Peak-frequency error is ambiguous for dual-
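The frequency and phase metrics the reviewers discuss can be illustrated with a small DFT-based sketch. The paper's exact metric definitions are not given on this page, so everything below is an assumption: peak-frequency error is taken as the gap between the dominant spectral bins of target and forecast, and phase error as the wrapped phase difference at the target's dominant bin. The function names are hypothetical. Because it picks a single dominant bin, the sketch also shows why such a metric becomes ambiguous once a signal has two comparable spectral peaks.

```python
import cmath
import math


def dft_bin(x, k):
    """Complex DFT coefficient of sequence x at integer bin k."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))


def peak_bin(x):
    """Index of the largest-magnitude bin, excluding DC and the Nyquist bin."""
    return max(range(1, len(x) // 2), key=lambda k: abs(dft_bin(x, k)))


def peak_frequency_error(y_true, y_pred, fs):
    """Absolute gap in Hz between dominant frequencies (equal-length inputs assumed)."""
    n = len(y_true)
    return abs(peak_bin(y_pred) - peak_bin(y_true)) * fs / n


def phase_error(y_true, y_pred):
    """Absolute phase difference in radians at the target's dominant bin, wrapped to [0, pi]."""
    k = peak_bin(y_true)
    diff = cmath.phase(dft_bin(y_pred, k)) - cmath.phase(dft_bin(y_true, k))
    return abs((diff + math.pi) % (2 * math.pi) - math.pi)


# Illustrative check with arbitrary values: 256 samples at 64 Hz.
n, fs = 256, 64.0
target = [math.sin(2 * math.pi * 8 * i / n) for i in range(n)]
wrong_freq = [math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
shifted = [math.sin(2 * math.pi * 8 * i / n + math.pi / 4) for i in range(n)]
print(peak_frequency_error(target, wrong_freq, fs))
print(phase_error(target, shifted))
```

The plain O(n²) DFT keeps the example dependency-free; for longer horizons an FFT routine would be the practical choice.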
Taxonomy
Topics: Forecasting Techniques and Applications · Time Series Analysis and Forecasting · Stock Market Forecasting Methods
