Using Pre-trained LLMs for Multivariate Time Series Forecasting
Malcolm L. Wolff, Shenghao Yang, Kari Torkkola, Michael W. Mahoney

TL;DR
This paper explores using pre-trained large language models for multivariate demand time series forecasting. It develops new strategies for embedding time series into the LLM token embedding space and reports results competitive with state-of-the-art forecasting models.
Contribution
It introduces a novel multivariate patching strategy that embeds time series features into decoder-only pre-trained Transformers, enabling transfer learning for forecasting tasks.
Findings
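The multivariate patching strategy produces results competitive with state-of-the-art time series forecasting models.
Mapping multivariate inputs into the LLM token embedding space lets knowledge in the pre-trained model transfer to time series.
Recently developed weight-based diagnostics are used to validate the findings.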
Abstract
Pre-trained Large Language Models (LLMs) encapsulate large amounts of knowledge and take enormous amounts of compute to train. We make use of this resource, together with the observation that LLMs are able to transfer knowledge and performance from one domain or even modality to another seemingly-unrelated area, to help with multivariate demand time series forecasting. Attention in transformer-based methods requires something worth attending to -- more than just samples of a time-series. We explore different methods to map multivariate input time series into the LLM token embedding space. In particular, our novel multivariate patching strategy to embed time series features into decoder-only pre-trained Transformers produces results competitive with state-of-the-art time series forecasting models. We also use recently-developed weight-based diagnostics to validate our findings.
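The abstract does not spell out the patching procedure, so the following is only a minimal sketch of one common way to patch a multivariate series and project it to a model's embedding width. It is not the authors' implementation. The function names `make_patches` and `embed_patches`, the patch length, stride, embedding width, and the randomly initialized linear projection are all assumptions made for illustration.

```python
import numpy as np

def make_patches(series, patch_len, stride):
    """Split a (T, C) multivariate series into overlapping patches.

    Returns an array of shape (num_patches, patch_len * C): each patch
    flattens patch_len consecutive time steps across all C features.
    """
    T, C = series.shape
    if T < patch_len:
        raise ValueError("series is shorter than one patch")
    starts = range(0, T - patch_len + 1, stride)
    return np.stack([series[s:s + patch_len].reshape(-1) for s in starts])

def embed_patches(patches, weight, bias):
    """Linearly project flattened patches to the embedding width."""
    return patches @ weight + bias

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
T, C, patch_len, stride, d_model = 64, 3, 16, 8, 768

series = rng.normal(size=(T, C))                  # stand-in for demand features
patches = make_patches(series, patch_len, stride)

# Assumed projection: in practice this layer would be trained, not random.
weight = rng.normal(scale=0.02, size=(patch_len * C, d_model))
bias = np.zeros(d_model)

tokens = embed_patches(patches, weight, bias)
print(patches.shape, tokens.shape)  # (7, 48) (7, 768)
```

Each row of `tokens` would play the role of one input embedding passed to the pre-trained decoder in place of a text token embedding. Grouping several time steps and all features into one patch gives attention a richer unit to attend to than a single sample, which is the motivation the abstract describes.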
Taxonomy
Methods: Softmax · Attention Is All You Need · Activation Patching
