CIPHER: Scalable Time Series Analysis for Physical Sciences with Application to Solar Wind Phenomena
Jasmine R. Kobayashi, Daniela Martin, Valmir P Moraes Filho, Connor O'Brien, Jinsu Hong, Sudeshna Boro Saikia, Hala Lamdouar, Nathan D. Miles, Marcella Scoczynski, Mavis Stone, Sairam Sundaresan, Anna Jungbluth, Andrés Muñoz-Jaramillo, Evangelia Samara, Joseph Gallego

TL;DR
CIPHER is a scalable framework that combines symbolic compression, clustering, and human validation to efficiently label complex physical science time series, demonstrated on solar wind data.
Contribution
CIPHER introduces a novel pipeline that integrates symbolic approximation, clustering, and expert validation for large-scale time series labeling in physics.
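The symbolic approximation step can be illustrated with plain SAX (Symbolic Aggregate approXimation), the fixed-cardinality representation that indexable variants build on. The sketch below is an assumption-laden illustration, not the paper's implementation: the function name `sax_word`, the 4-letter alphabet, and the requirement that the series length divide evenly into segments are all choices made here for brevity. It z-normalizes a series, averages it over equal-width segments (piecewise aggregate approximation), and maps each segment mean to a letter using breakpoints that split a standard normal distribution into four equiprobable bins.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Breakpoints splitting a standard normal into 4 equiprobable bins (quartiles).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"  # illustrative alphabet; one letter per bin


def sax_word(series, n_segments):
    """Return a SAX word for `series` (hypothetical helper, not from the paper)."""
    if n_segments <= 0 or len(series) % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")

    # 1. z-normalize so that series with different scales become comparable.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # 2. Piecewise aggregate approximation: mean of each equal-width segment.
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # 3. Discretize each segment mean into a letter via the Gaussian breakpoints.
    return "".join(ALPHABET[bisect_right(BREAKPOINTS, v)] for v in paa)
```

For instance, a step from low to high values, `sax_word([0, 0, 0, 0, 10, 10, 10, 10], 2)`, yields a two-letter word whose first letter is from the lowest bin and whose second is from the highest. Words like these are compact and human-readable, which is what makes the compressed representation inspectable by an expert.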
Findings
Successfully classifies solar wind phenomena such as coronal mass ejections (CMEs) and stream regions.
Demonstrates scalable, systematic labeling with minimal expert effort.
Provides a reproducible framework applicable to physical sciences data.
Abstract
Labeling or classifying time series is a persistent challenge in the physical sciences, where expert annotations are scarce, costly, and often inconsistent. Yet robust labeling is essential to enable machine learning models for understanding, prediction, and forecasting. We present the Clustering and Indexation Pipeline with Human Evaluation for Recognition (CIPHER), a framework designed to accelerate large-scale labeling of complex time series in physics. CIPHER integrates indexable Symbolic Aggregate approXimation (iSAX) for interpretable compression and indexing, density-based clustering (HDBSCAN) to group recurring phenomena, and a human-in-the-loop step for efficient expert validation. Representative samples are labeled by domain scientists, and these annotations are propagated across clusters to yield systematic, scalable classifications. We evaluate CIPHER on…
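The label-propagation step described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' code: `propagate_labels`, `cluster_ids`, and `expert_labels` are hypothetical names, the cluster assignments are assumed to come from a density-based clusterer that marks noise points with `-1` (the convention used by common HDBSCAN implementations), and the example labels are placeholders. Each time series segment inherits the label an expert gave to its cluster's representative; noise points and unreviewed clusters stay unlabeled.

```python
def propagate_labels(cluster_ids, expert_labels, noise_id=-1):
    """Assign each sample the expert label of its cluster (hypothetical helper).

    cluster_ids   -- one cluster id per sample, e.g. the output of a clusterer
    expert_labels -- dict mapping a cluster id to the label an expert assigned
                     after inspecting representative samples of that cluster
    noise_id      -- id used for samples that belong to no cluster
    """
    labels = []
    for cid in cluster_ids:
        if cid == noise_id:
            labels.append(None)  # noise: no cluster label to inherit
        else:
            labels.append(expert_labels.get(cid))  # None if cluster unreviewed
    return labels


# Illustrative usage: six segments, two reviewed clusters, one noise point,
# and one cluster (id 2) that no expert has looked at yet.
cluster_ids = [0, 0, 1, -1, 1, 2]
expert_labels = {0: "CME", 1: "stream region"}  # placeholder labels
labels = propagate_labels(cluster_ids, expert_labels)
```

The expert effort scales with the number of clusters rather than the number of samples, which is the source of the scalability claimed above.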
Taxonomy
Topics: Time Series Analysis and Forecasting · Complex Systems and Time Series Analysis · Statistical Mechanics and Entropy
