# Estimating variances in time series linear regression models using empirical BLUPs and convex optimization

**Authors:** Martina Hančová, Gabriela Vozáriková, Andrej Gajdoš, Jozef Hanč

arXiv: 1905.07771 · 2020-03-10

## TL;DR

This paper introduces a novel two-stage method for estimating variance components in time series linear mixed models using empirical BLUPs and convex optimization, improving computational efficiency and theoretical understanding.

## Contribution

It develops a new estimation approach for variance components in finite discrete spectrum linear regression models (FDSLRMs), establishing the theoretical existence and equivalence of the least squares and maximum likelihood estimators, and a fast, accurate algorithm for computing (RE)MLE.

## Key findings

- The method provides invariant, non-negative quadratic estimators applicable to any absolutely continuous distribution.
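- A new $\mathcal{O}(n)$ algorithm for (RE)MLE computation is, at the default precision, $10^7$ times more accurate and $n^2$ times faster than the best current Python- or R-based packages (CVXPY, CVXR, nlme, sommer and mixed).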
- Validated on real datasets: electricity, tourism, and cybersecurity, demonstrating practical effectiveness.

## Abstract

We propose a two-stage estimation method of variance components in time series models known as FDSLRMs, whose observations can be described by a linear mixed model (LMM). We based the estimation of variances, which are fundamental quantities in a time series forecasting approach called kriging, on the empirical (plug-in) best linear unbiased predictions of the unobservable random components in the FDSLRM. The method, providing invariant non-negative quadratic estimators, can be used for any absolutely continuous probability distribution of time series data. As a result of applying convex optimization and the LMM methodology, we resolved two problems: the theoretical existence and equivalence of the least squares estimators, non-negative (M)DOOLSE, and the maximum likelihood estimators, (RE)MLE, as possible starting points of our method, and the practical lack of a computational implementation for FDSLRM. As for computing (RE)MLE in the case of $n$ observed time series values, we also discovered a new algorithm of order $\mathcal{O}(n)$, which at the default precision is $10^7$ times more accurate and $n^2$ times faster than the best current Python- or R-based computational packages, namely CVXPY, CVXR, nlme, sommer and mixed. We illustrate our results on three real data sets (electricity consumption, tourism and cyber security), which are easily available, reproducible, sharable and modifiable in the form of interactive Jupyter notebooks.
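To make the idea of a non-negative least squares starting point more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy FDSLRM of the form $x = F\beta + Vy + w$ with uncorrelated zero-mean random effects $y_j$ and white noise $w$, so that the covariance matrix is $\sigma_0^2 I + \sum_j \sigma_j^2 v_j v_j'$. The regressors, frequencies, function names and the exact fitting criterion (matching the outer product of OLS residuals to the projected covariance structure under non-negativity constraints) are all assumptions chosen for illustration; the summary above does not state the paper's exact objective or its $\mathcal{O}(n)$ algorithm.

```python
import numpy as np
from scipy.optimize import nnls


def simulate_toy_fdslrm(n, beta, sigma2, rng):
    """Simulate x = F beta + V y + w for a hypothetical toy FDSLRM.

    sigma2[0] is the white-noise variance, sigma2[1:] are the variances
    of the random effects y_j. The regressors below are illustrative only.
    """
    t = np.arange(1, n + 1)
    # Hypothetical mean-value regressors: intercept plus a period-12 cosine.
    F = np.column_stack([np.ones(n), np.cos(2 * np.pi * t / 12)])
    # Hypothetical random-effect regressors: period-4 cosine and sine.
    V = np.column_stack([np.cos(2 * np.pi * t / 4), np.sin(2 * np.pi * t / 4)])
    y = rng.normal(0.0, np.sqrt(sigma2[1:]))          # one draw per random effect
    w = rng.normal(0.0, np.sqrt(sigma2[0]), size=n)   # white noise
    return F @ beta + V @ y + w, F, V


def nonnegative_ls_variances(x, F, V):
    """Non-negative least squares fit of variance components (a sketch).

    Assumed criterion: minimize || e e' - M (nu_0 I + sum_j nu_j v_j v_j') M ||_F^2
    subject to nu >= 0, where M projects onto the orthogonal complement of
    the column space of F and e = M x are the OLS residuals.
    """
    n = len(x)
    M = np.eye(n) - F @ np.linalg.pinv(F)   # OLS residual projector
    e = M @ x
    # Basis matrices of the covariance structure: I and v_j v_j'.
    basis = [np.eye(n)] + [np.outer(V[:, j], V[:, j]) for j in range(V.shape[1])]
    # Vectorize each projected basis matrix into one column of the design.
    A = np.column_stack([(M @ B @ M).ravel() for B in basis])
    b = np.outer(e, e).ravel()
    nu, _ = nnls(A, b)                      # convex problem with nu >= 0
    return nu


rng = np.random.default_rng(0)
x, F, V = simulate_toy_fdslrm(
    n=48,
    beta=np.array([10.0, 2.0]),
    sigma2=np.array([1.0, 4.0, 9.0]),
    rng=rng,
)
nu_hat = nonnegative_ls_variances(x, F, V)
print(nu_hat)  # estimates of (sigma_0^2, sigma_1^2, sigma_2^2), all >= 0
```

Because a single realization contains only one draw of each random effect, the estimates of the random-effect variances from this sketch can be far from the simulated values; the point is only to show that the non-negativity constraint turns the fit into a small convex problem. Note also that this dense formulation builds an $n^2 \times 3$ design matrix and is unrelated to the fast algorithm reported in the abstract.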

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.07771/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1905.07771/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/1905.07771/full.md

---
Source: https://tomesphere.com/paper/1905.07771