Unlocking the Value of Text: Event-Driven Reasoning and Multi-Level Alignment for Time Series Forecasting
Siyuan Wang, Peng Chen, Yihang Wang, Wanghui Qiu, Chenjuan Guo, Bin Yang, Yang Shu

TL;DR
This paper introduces VoT, a novel approach that leverages event-driven reasoning and multi-level alignment of textual and numerical data to significantly improve time series forecasting accuracy across diverse real-world datasets.
Contribution
VoT combines exogenous text with LLM reasoning and multi-level alignment techniques to enhance time series forecasting beyond traditional numerical methods.
Findings
Significant performance improvements over existing methods.
Effective integration of textual information with time series data.
Versatility demonstrated across 10 real-world domains.
Abstract
Existing time series forecasting methods primarily rely on the numerical data itself. However, real-world time series exhibit complex patterns associated with multimodal information, making them difficult to predict with numerical data alone. While several multimodal time series forecasting methods have emerged, they either utilize text with limited supplementary information or focus merely on representation extraction, extracting minimal textual information for forecasting. To unlock the Value of Text, we propose VoT, a method with Event-driven Reasoning and Multi-level Alignment. Event-driven Reasoning combines the rich information in exogenous text with the powerful reasoning capabilities of LLMs for time series forecasting. To guide the LLMs in effective reasoning, we propose Historical In-context Learning, which retrieves and applies historical examples as in-context guidance. To…
Peer Reviews
Decision: ICLR 2026 Poster
1. I like that the authors performed ablation on every module (ETA, AFF, HIC, Event branch), and appreciate the authors giving some insights rather than just presenting numbers. For instance, in section 4.3.1, the authors note that "removing HIC results in worse performance than removing the entire event-driven branch (w/o Event), suggesting that unguided LLM reasoning can be more detrimental than no reasoning at all.", which I think is a useful insight.
2. The pipeline of using LLMs to process

1. The main weakness of the paper stems from the risk of temporal leakage in HIC module: It is crucial that the knowledge base never indexes summaries/corrections from future windows relative to the test horizon. The paper should spell out strict time-based splits for KB construction and retrieval, but HIC description lacks an explicit leakage guarantee.
2. The final prediction of the model is a linear combination of band-specific frequency components $\mathcal{F}^b_{*}$ between the exogenous (

- Clear motivation that textual signals carry event-driven information complementary to numerical dynamics.
- Technically coherent pipeline combining LLM-based reasoning and representation alignment.

- The Event-Driven Reasoning and Historical In-Context Learning (HIC) components are interesting, but it remains somewhat unclear how much of the improvement truly comes from the reasoning process itself. Could the authors clarify whether HIC’s benefit persists when compared to simpler retrieval or embedding-based baselines? Also, since the reasoning pipeline involves curated summaries and correction prompts, how robust is it under domain shift or when applied without instruction-tuned LLMs?
- T

Originality: Proposes a dual-branch framework that combines Event-driven Reasoning from exogenous text with Multi-level Alignment (ETA for representation-level, AFF for prediction-level). The Historical In-Context Learning (HIC) idea, retrieving corrected past reasoning to guide current predictions, is a creative reuse of prior errors as in-context examples.

- The endogenous text appears to restate information already present in the series; the paper does not clearly justify why ETA (textualizing trend/seasonal components and aligning them) adds value beyond standard time-series decomposition and representation learning.
- The computation of $Y_{\text{num}}$ is under-specified. It is unclear whether the forward pass at inference consumes only $H_{\text{ts}}$ or also the aligned textual features (e.g., $Z_{\mathrm{tr}}, Z_{\mathrm{se}}$).
- The cons
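
The temporal-leakage concern raised in the reviews about HIC can be made concrete with a small sketch. The Python below is not the paper's implementation. It only illustrates one way a retrieval step could enforce a strict time-based cutoff, so that no knowledge-base entry from a future window is ever returned as an in-context example. All names here (`KBEntry`, `available_on`, `retrieve_past_examples`) and the toy two-dimensional embeddings are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
import math


@dataclass
class KBEntry:
    # Hypothetical schema: the date on which this entry's ground truth
    # (and therefore its corrected reasoning) became known.
    available_on: date
    embedding: list[float]  # assumed text embedding of the summary
    summary: str            # assumed corrected reasoning text


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_past_examples(kb, query_embedding, forecast_start, k=2):
    """Return the k most similar entries that were available strictly
    before the forecast window starts (the leakage guard)."""
    eligible = [e for e in kb if e.available_on < forecast_start]
    ranked = sorted(
        eligible,
        key=lambda e: cosine(e.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:k]


# Toy knowledge base: the third entry matches the query best but lies in
# the future relative to the forecast window, so it must be excluded.
kb = [
    KBEntry(date(2023, 1, 31), [1.0, 0.0], "rate hike -> demand fell"),
    KBEntry(date(2023, 6, 30), [0.9, 0.1], "supply shock -> prices rose"),
    KBEntry(date(2024, 3, 31), [1.0, 0.0], "future event, must not be retrieved"),
]

hits = retrieve_past_examples(
    kb, query_embedding=[1.0, 0.0], forecast_start=date(2024, 1, 1), k=2
)
for h in hits:
    print(h.available_on, h.summary)
```

Filtering before ranking, rather than after, means a highly similar future entry cannot displace an eligible past one. Whether the paper's knowledge base applies an equivalent guard is exactly what the reviewer asks the authors to state.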
Taxonomy
Topics: Time Series Analysis and Forecasting · Forecasting Techniques and Applications · Topic Modeling
