LLM Pretraining Shapes a Generalizable Manifold: Insights into Cross-Modal Transfer to Time Series

Alexis Roger; Prateek Humane; Zhenghan Tai; Gwen Legate; Andrei Mircea; Vasilii Feofanov; Irina Rish

arXiv:2605.20449·cs.LG·May 21, 2026

TL;DR

This paper demonstrates that language-pretrained transformers develop a reusable manifold that facilitates effective cross-modal transfer to time-series forecasting, highlighting the geometric basis of transfer learning.

Contribution

The paper shows that language pretraining creates a manifold that supports time-series prediction without paired data, and that finetuning aligns this manifold to specific tasks.

Findings

01. Linear probes decode realistic time-series trajectories from frozen LLM states.

02. Retrieval in the projected space yields competitive time-series forecasts.

03. Pretrained initialization improves optimization and produces a highly anisotropic loss landscape.
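
As an illustration of finding 02, the following is a minimal sketch of nearest-neighbor retrieval forecasting in a linearly projected space. It is not the authors' implementation: the projection `W`, the toy sine series, the window bank, and the function name `retrieve_forecast` are all assumptions made for this example, and the query state is constructed so that it projects exactly onto a known window.

```python
import numpy as np


def retrieve_forecast(query_state, W, bank_contexts, bank_futures, k=1):
    """Project a hidden state with a linear map, find the k nearest
    context windows in the bank, and average their continuations."""
    z = query_state @ W                                  # shape: (context_len,)
    dists = np.linalg.norm(bank_contexts - z, axis=1)    # distance to each window
    nearest = np.argsort(dists)[:k]
    return bank_futures[nearest].mean(axis=0)


# Toy setup (all values are illustrative assumptions).
rng = np.random.default_rng(0)
hidden_dim, context_len, horizon = 16, 8, 4

# Hypothetical linear probe from hidden states to time-series windows.
W = rng.normal(size=(hidden_dim, context_len))

# Bank of (context, future) pairs cut from a synthetic periodic series.
series = np.sin(2 * np.pi * np.arange(200) / 20)
span = context_len + horizon
windows = np.stack([series[i:i + span] for i in range(len(series) - span + 1)])
bank_contexts = windows[:, :context_len]
bank_futures = windows[:, context_len:]

# Build a stand-in "hidden state" that projects onto bank window 5.
query_state = bank_contexts[5] @ np.linalg.pinv(W)

forecast = retrieve_forecast(query_state, W, bank_contexts, bank_futures)
```

In this toy case the retrieved continuation matches the true future of the matched window. With real model states, the quality of the forecast would depend on how well the projected space preserves temporal structure.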

Abstract

Can language-pretrained transformers become effective time-series forecasters, and why? In this paper, we show that cross-modal transfer arises because language pretraining preconditions time series training with a reusable manifold. A linear probe on frozen LLM states decodes realistic time-series trajectories without paired supervision, and retrieval in this projected space yields competitive forecasts, showing that structure and dynamics exist before finetuning. Pretrained initialization also improves optimization, producing coherent gradients and a highly anisotropic loss landscape unlike random initialization. Finetuning then acts as low-dimensional alignment, reusing existing directions rather than learning temporal primitives from scratch, as evidenced by low-rank updates, subspace alignment, and shared features for periodicity, trend, and repetition. Together, these results…
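
One way to check the low-rank claim is to measure how many singular directions of a finetuning update carry most of its energy. The sketch below is illustrative only: the matrices are synthetic, the 99% energy threshold and the helper `rank_for_energy` are assumptions, and the paper may use a different rank measure.

```python
import numpy as np


def rank_for_energy(delta, energy=0.99):
    """Smallest number of singular directions whose squared singular
    values sum to at least `energy` of the squared Frobenius norm."""
    s = np.linalg.svd(delta, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(cumulative, energy) + 1)


rng = np.random.default_rng(0)
d = 64

# Stand-in for a pretrained weight matrix (synthetic, not from any model).
W_pre = rng.normal(size=(d, d))

# Hypothetical finetuning update: a rank-2 change plus small dense noise.
U = rng.normal(size=(d, 2))
V = rng.normal(size=(2, d))
W_ft = W_pre + U @ V + 1e-3 * rng.normal(size=(d, d))

low_rank = rank_for_energy(W_ft - W_pre)             # update is nearly rank 2
dense_rank = rank_for_energy(rng.normal(size=(d, d)))  # unstructured baseline
```

A small value for the update relative to the unstructured baseline is the kind of signal that would be consistent with finetuning reusing a few existing directions.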
