Insight Miner: A Time Series Analysis Dataset for Cross-Domain Alignment with Natural Language
Yunkai Zhang, Yawen Zhang, Ming Zheng, Kezhen Chen, Chongyang Gao, Ruian Ge, Siyuan Teng, Amine Jelloul, Jinmeng Rao, Xiaoyuan Guo, Chiang-Wei Fang, Zeyu Zheng, Jie Yang

TL;DR
Insight Miner introduces a large-scale dataset and a multimodal model that generate detailed, domain-enriched descriptions of time series data, advancing automatic insight extraction across various scientific fields.
Contribution
The paper presents TS-Insights, the first general-domain dataset for time series and language alignment, and Insight Miner, a multimodal model trained on it that outperforms existing models on time series description tasks.
Findings
Insight Miner surpasses state-of-the-art models in time series description.
TS-Insights enables effective training of multimodal models for time series analysis.
The approach demonstrates potential for LLMs to interpret time series as a native modality.
Abstract
Time-series data is critical across many scientific and industrial domains, including environmental analysis, agriculture, transportation, and finance. However, mining insights from this data typically requires deep domain expertise, a process that is both time-consuming and labor-intensive. In this paper, we propose Insight Miner, a large-scale multimodal model (LMM) designed to generate high-quality, comprehensive time-series descriptions enriched with domain-specific knowledge. To facilitate this, we introduce TS-Insights (available at https://huggingface.co/datasets/zhykoties/time-series-language-alignment), the first general-domain dataset for time series and language alignment. TS-Insights contains 100k time-series windows sampled from 20 forecasting datasets. We construct…
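The abstract describes TS-Insights as a collection of time-series windows sampled from forecasting datasets. The excerpt here does not say how those windows are drawn, so the following is only a minimal sketch of one common approach: picking random contiguous windows of a fixed length from a single series. The function name `sample_windows`, the window length, the number of windows, and the toy series are all hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def sample_windows(
    series: Sequence[float],
    window_length: int,
    num_windows: int,
    seed: int = 0,
) -> List[Tuple[int, List[float]]]:
    """Sample random contiguous windows from one series.

    Returns (start_index, window_values) pairs. All parameters are
    illustrative assumptions, not values from the paper.
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    if window_length > len(series):
        raise ValueError("window_length exceeds series length")
    rng = random.Random(seed)  # local RNG keeps the sampling reproducible
    max_start = len(series) - window_length
    windows = []
    for _ in range(num_windows):
        start = rng.randint(0, max_start)  # inclusive on both ends
        windows.append((start, list(series[start:start + window_length])))
    return windows


# Toy series standing in for one forecasting dataset (hypothetical data).
toy_series = [float(i % 7) for i in range(100)]
windows = sample_windows(toy_series, window_length=24, num_windows=5)
```

Each returned pair keeps the start index alongside the values, so a generated description can later be traced back to its position in the source series.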
Taxonomy
Topics: Time Series Analysis and Forecasting · Machine Learning in Healthcare · Forecasting Techniques and Applications
