Thoth: Mid-Training Bridges LLMs to Time Series Understanding
Jiafeng Lin, Yuxuan Wang, Jialong Wu, Huakun Luo, Zhongyi Pei, Jianmin Wang

TL;DR
Thoth adds a mid-training stage that teaches large language models to understand and reason about time series data, improving their performance on time series question answering benchmarks.
Contribution
This work presents Thoth, the first family of mid-trained LLMs with general-purpose time series understanding, together with Book-of-Thoth, a time-series-centric mid-training corpus, and KnoTS, a benchmark for evaluating knowledge-intensive reasoning.
Findings
Thoth outperforms base models and advanced LLMs on time series question answering.
Mid-training with Book-of-Thoth enhances temporal reasoning capabilities.
When fine-tuned, Thoth performs well even when training data is scarce.
Abstract
Large Language Models (LLMs) have demonstrated remarkable success in general-purpose reasoning. However, they still struggle to understand and reason about time series data, which limits their effectiveness in decision-making scenarios that depend on temporal dynamics. In this paper, we propose Thoth, the first family of mid-trained LLMs with general-purpose time series understanding capabilities. As a pivotal intermediate stage, mid-training achieves task- and domain-agnostic alignment between time series and natural language, for which we construct Book-of-Thoth, a high-quality, time-series-centric mid-training corpus. Book-of-Thoth enables both time-series-to-text and text-to-time-series generation, equipping LLMs with a foundational grasp of temporal patterns. To better evaluate advanced reasoning capabilities, we further present KnoTS, a novel benchmark of knowledge-intensive time…
Taxonomy
Topics: Time Series Analysis and Forecasting · Topic Modeling · Multimodal Machine Learning Applications
