Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data
Zhaoxing Gao, Ruey S. Tsay

TL;DR
This paper introduces a hierarchical distributed factor analysis method for large-scale, high-dimensional time series data, enabling efficient modeling and forecasting across multiple computing nodes.
Contribution
The paper proposes a multi-level, PCA-based hierarchical approach to distributed factor modeling of large-scale time series data, together with theoretical properties of the procedure and a forecasting method built on it.
Findings
Effective dimension reduction in large-scale data
Improved forecasting accuracy over existing methods
Theoretical guarantees for the hierarchical approach
Abstract
This paper proposes a hierarchical approximate-factor approach to analyzing high-dimensional, large-scale heterogeneous time series data using distributed computing. The new method employs a multiple-fold dimension reduction procedure using Principal Component Analysis (PCA) and shows great promise for modeling large-scale data that can be neither stored nor analyzed by a single machine. Each computer at the basic level performs a PCA to extract common factors among the time series assigned to it and transfers those factors to one and only one node of the second level. Each second-level computer collects the common factors from its subordinates and performs another PCA to select the second-level common factors. This process is repeated until the central server is reached, which collects common factors from its direct subordinates and performs a final PCA to select the global common factors. The…
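The node-by-node procedure described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the function names `pca_factors` and `hierarchical_pca`, the `fan_in` parameter (how many subordinates report to each higher-level node), and the fixed factor counts `r_node` and `r_global` are all assumptions made for illustration. In particular, the sketch takes the number of factors as given instead of selecting it from the data, and it simulates the computing nodes as entries in a Python list.

```python
import numpy as np


def pca_factors(X, r):
    """Extract r principal-component factors from a T x N panel X.

    The factors are scaled so that F.T @ F / T is the identity matrix.
    """
    Xc = X - X.mean(axis=0)                      # center each series
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return np.sqrt(X.shape[0]) * U[:, :r]


def hierarchical_pca(blocks, r_node, r_global, fan_in=2):
    """Illustrative multi-level PCA over a tree of simulated nodes.

    blocks   : list of T x n_i arrays, one per basic-level node
    r_node   : factors kept by every non-central node (assumed fixed)
    r_global : factors kept by the central server (assumed fixed)
    fan_in   : subordinates per higher-level node (hypothetical parameter)
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    # Basic level: each node runs PCA on its own series only.
    level = [pca_factors(B, r_node) for B in blocks]
    # Intermediate levels: each node stacks the factors sent by its
    # subordinates and runs another PCA. Every subordinate reports to
    # exactly one parent because the groups do not overlap.
    while len(level) > fan_in:
        level = [
            pca_factors(np.hstack(level[i:i + fan_in]), r_node)
            for i in range(0, len(level), fan_in)
        ]
    # Central server: final PCA on the factors of its direct subordinates.
    return pca_factors(np.hstack(level), r_global)


# Simulated example: 8 basic-level nodes, 10 series each, 2 true factors.
rng = np.random.default_rng(0)
T, n_nodes, n_per_node, r_true = 200, 8, 10, 2
F_true = rng.standard_normal((T, r_true))
blocks = [
    F_true @ rng.standard_normal((r_true, n_per_node))
    + 0.1 * rng.standard_normal((T, n_per_node))
    for _ in range(n_nodes)
]
F_hat = hierarchical_pca(blocks, r_node=2, r_global=2, fan_in=2)

# Share of the variance of the centered true factors explained by F_hat.
Fc = F_true - F_true.mean(axis=0)
proj = F_hat @ (F_hat.T @ Fc) / T
r_squared = 1.0 - np.sum((Fc - proj) ** 2) / np.sum(Fc ** 2)
```

In this sketch only the `T x r_node` factor matrices move between levels, never the raw series, which is the property that makes the scheme suitable for data too large for one machine. The simulated panel above has a strong common component, so `r_squared` is a rough check that the global factors recover the space of the true ones; it says nothing about the heterogeneous settings the paper targets.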
Taxonomy
Topics: Neural Networks and Applications · Complex Systems and Time Series Analysis · Image and Signal Denoising Methods
