LaSTR: Language-Driven Time-Series Segment Retrieval

Kota Dohi; Harsh Purohit; Tomoya Nishida; Takashi Endo; Yusuke Ohtsubo; Koichiro Yawata; Koki Takeshita; Tatsuya Sasaki; Yohei Kawaguchi

arXiv:2603.00725 · cs.CL · March 3, 2026

TL;DR

LaSTR retrieves local time-series segments that match a natural language query. It trains a contrastive retriever on large-scale segment-caption data and outperforms random and CLIP baselines in ranking quality and semantic alignment.

Contribution

The paper presents a framework that pairs GPT-5.2-generated segment descriptions with a Conformer-based contrastive retriever, enabling retrieval of time-series segments from natural language queries.

Findings

01 Outperforms random and CLIP baselines in retrieval quality

02 Achieves stronger semantic alignment between queries and retrieved segments

03 Demonstrates effectiveness across various candidate pool sizes
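
The findings refer to single-positive retrieval evaluated at several candidate pool sizes. A minimal sketch of how such an evaluation could be set up follows. It assumes that embeddings are already computed, that similarity is a plain dot product, and that negatives are sampled uniformly from the other segments. The function names `rank_of_positive` and `evaluate`, and the choice of Recall@k and MRR as metrics, are illustrative assumptions and not taken from the paper.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rank_of_positive(query, positive, negatives):
    """1-based rank of the positive within the pool (positive + negatives)."""
    pos_score = dot(query, positive)
    # Every negative that scores strictly higher pushes the positive down one rank.
    return 1 + sum(1 for neg in negatives if dot(query, neg) > pos_score)


def evaluate(queries, segments, pool_size, k=1, seed=0):
    """queries[i] is paired with segments[i]. Returns (Recall@k, MRR).

    Each query is scored against a pool of `pool_size` candidates:
    its single positive plus `pool_size - 1` randomly sampled negatives.
    """
    rng = random.Random(seed)
    hits, reciprocal_rank_sum = 0, 0.0
    for i, query in enumerate(queries):
        others = [j for j in range(len(segments)) if j != i]
        negative_ids = rng.sample(others, pool_size - 1)
        rank = rank_of_positive(query, segments[i], [segments[j] for j in negative_ids])
        hits += rank <= k
        reciprocal_rank_sum += 1.0 / rank
    n = len(queries)
    return hits / n, reciprocal_rank_sum / n
```

Calling `evaluate` in a loop over several `pool_size` values (each no larger than the number of available segments) gives one pair of metrics per pool size. A random-ranking baseline can be obtained by passing random vectors as `queries`.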

Abstract

Effectively searching time-series data is essential for system analysis, but existing methods often require expert-designed similarity criteria or rely on global, series-level descriptions. We study language-driven segment retrieval: given a natural language query, the goal is to retrieve relevant local segments from large time-series repositories. We build large-scale segment–caption training data by applying TV2-based segmentation to LOTSA windows and generating segment descriptions with GPT-5.2, and then train a Conformer-based contrastive retriever in a shared text–time-series embedding space. On a held-out test split, we evaluate single-positive retrieval together with caption-side consistency (SBERT and VLM-as-a-judge) under multiple candidate pool sizes. Across all settings, LaSTR outperforms random and CLIP baselines, yielding improved ranking quality and stronger semantic…
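
The abstract does not spell out the contrastive objective. One common choice for training a shared embedding space, assumed here purely for illustration, is the symmetric InfoNCE loss used by CLIP-style models. The sketch below takes a batch of paired text and time-series embeddings, where row i of each list belongs together. The function names and the default temperature of 0.07 are assumptions, not details from the paper.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(text_emb, ts_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired (text, time-series) embeddings."""
    text = [normalize(v) for v in text_emb]
    series = [normalize(v) for v in ts_emb]
    # Cosine similarity of every text against every segment, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(t, s)) / temperature for s in series]
        for t in text
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    # Average the text-to-series and series-to-text directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_transposed))
```

With this loss, matched pairs are pulled together and every other pairing in the batch acts as a negative, so a batch whose pairs are correctly aligned yields a lower value than the same batch with its segments shuffled.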

Taxonomy

Topics: Time Series Analysis and Forecasting · Multimodal Machine Learning Applications · Topic Modeling