TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning
Junru Zhang, Lang Feng, Xu Guo, Yuhan Wu, Yabo Dong, Duanqing Xu

TL;DR
TimeMaster introduces a reinforcement learning approach for multimodal large language models to perform structured, interpretable reasoning over visualized time-series data, achieving state-of-the-art results on real-world classification tasks.
Contribution
The paper presents an RL-based training method with a three-part structured output format for time-series reasoning in multimodal LLMs, improving both classification performance and interpretability.
Findings
Outperforms classical time-series models and GPT-4o on the TimerBed benchmark, with performance gains of over 14.6% and 7.3%, respectively.
Generates explanations and domain insights.
Abstract
Time-series reasoning remains a significant challenge in multimodal large language models (MLLMs) due to dynamic temporal patterns, ambiguous semantics, and a lack of temporal priors. In this work, we introduce TimeMaster, a reinforcement learning (RL)-based method that enables time-series MLLMs to perform structured, interpretable reasoning directly over visualized time-series inputs and task prompts. TimeMaster adopts a three-part structured output format (reasoning, classification, and domain-specific extension) and is optimized via a composite reward function that aligns format adherence, prediction accuracy, and open-ended insight quality. The model is trained using a two-stage pipeline: we first apply supervised fine-tuning (SFT) to establish a good initialization, followed by Group Relative Policy Optimization (GRPO) at the token level to enable stable and targeted…
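To make the composite reward and the group-relative step in GRPO concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the tag names (`<think>`, `<class>`, `<extension>`), the additive weighting, the externally supplied `insight_score`, and both function names are hypothetical. The abstract states only that the reward combines format adherence, prediction accuracy, and insight quality.

```python
import re
from statistics import mean, pstdev

# Hypothetical tag names: the abstract says only that the output has three
# parts (reasoning, classification, domain-specific extension).
OUTPUT_PATTERN = re.compile(
    r"^<think>(?P<reasoning>.+?)</think>\s*"
    r"<class>(?P<label>.+?)</class>\s*"
    r"<extension>(?P<extension>.+?)</extension>$",
    re.DOTALL,
)


def composite_reward(output, gold_label, insight_score=0.0,
                     w_format=1.0, w_acc=1.0, w_insight=1.0):
    """Score one sampled response.

    Assumption: the three reward terms are summed with fixed weights, and
    insight_score (in [0, 1]) comes from some external judge.
    """
    match = OUTPUT_PATTERN.match(output.strip())
    if match is None:
        # A malformed response earns nothing, including no accuracy credit.
        return 0.0
    reward = w_format
    if match.group("label").strip().lower() == gold_label.strip().lower():
        reward += w_acc
    reward += w_insight * insight_score
    return reward


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt.

    This is the group-relative baseline used by GRPO: each reward is centered
    on the group mean and scaled by the group standard deviation (population
    standard deviation here, which is an assumption of this sketch).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In a training loop, several responses would be sampled per prompt, each scored with `composite_reward`, and the resulting list passed to `group_relative_advantages`, so that each response's advantage is shared by all of its tokens in the policy-gradient update.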
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Time Series Analysis and Forecasting
