Variable Selection for Comparing High-dimensional Time-Series Data
Kensuke Mitsuzawa, Margherita Grossi, Stefano Bortoli, Motonobu Kanagawa

TL;DR
This paper introduces a method that selects the variables and time intervals in which two high-dimensional time series differ significantly. It is aimed at validating and comparing simulators, particularly when simulations are computationally expensive.
Contribution
The paper proposes an approach that splits the time axis into subintervals and, on each subinterval, selects the variables that distinguish the two series and performs a two-sample test. The approach is evaluated on synthetic data and on simulator outputs.
Findings
Effective in identifying significant variables and intervals
Demonstrated on fluid and traffic simulators
Provides a tool for simulator validation and comparison
Abstract
Given a pair of multivariate time-series data of the same length and dimensions, an approach is proposed to select variables and time intervals where the two series are significantly different. In applications where one time series is an output from a computationally expensive simulator, the approach may be used for validating the simulator against real data, for comparing the outputs of two simulators, and for validating a machine learning-based emulator against the simulator. With the proposed approach, the entire time interval is split into multiple subintervals, and on each subinterval, the two sample sets are compared to select variables that distinguish their distributions and a two-sample test is performed. The validity and limitations of the proposed approach are investigated in synthetic data experiments. Its usefulness is demonstrated in an application with a particle-based…
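The pipeline described in the abstract (split the time axis, select variables on each subinterval, then run a two-sample test) can be illustrated with a minimal sketch. This is not the authors' implementation: the selection score (a standardized mean difference), the test statistic (the norm of the mean difference with a permutation test), the even/odd split of time points into selection and test halves, and all names such as `compare_series` and `score_threshold` are assumptions chosen for brevity.

```python
import numpy as np

def mean_diff_stat(a, b):
    # Stand-in test statistic: Euclidean norm of the difference of sample means.
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))

def permutation_pvalue(a, b, n_perm=500, rng=None):
    # Permutation two-sample test: shuffle the pooled samples and count how
    # often the permuted statistic is at least as large as the observed one.
    rng = np.random.default_rng() if rng is None else rng
    observed = mean_diff_stat(a, b)
    pooled = np.vstack([a, b])
    n = len(a)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if mean_diff_stat(pooled[perm[:n]], pooled[perm[n:]]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

def compare_series(x, y, n_intervals=4, score_threshold=1.0, seed=0):
    # x, y: arrays of shape (T, D), same length and dimensions.
    assert x.shape == y.shape
    rng = np.random.default_rng(seed)
    results = []
    for idx in np.array_split(np.arange(x.shape[0]), n_intervals):
        # Assumption: even time points select variables, odd ones are tested,
        # so that selection and testing do not reuse the same samples.
        sel_idx, test_idx = idx[0::2], idx[1::2]
        xs, ys = x[sel_idx], y[sel_idx]
        pooled_std = np.sqrt(0.5 * (xs.var(axis=0) + ys.var(axis=0))) + 1e-12
        score = np.abs(xs.mean(axis=0) - ys.mean(axis=0)) / pooled_std
        selected = np.flatnonzero(score > score_threshold)
        p_value = None  # no test is run if no variable is selected
        if selected.size > 0:
            p_value = permutation_pvalue(
                x[test_idx][:, selected], y[test_idx][:, selected], rng=rng
            )
        results.append({
            "start": int(idx[0]),
            "end": int(idx[-1]) + 1,  # exclusive end of the subinterval
            "variables": selected.tolist(),
            "p_value": p_value,
        })
    return results

# Synthetic example: variable 2 is shifted in the second half of the series.
data_rng = np.random.default_rng(42)
x = data_rng.normal(size=(200, 5))
y = data_rng.normal(size=(200, 5))
y[100:, 2] += 3.0
results = compare_series(x, y)
for r in results:
    print(r)
```

Each entry of `results` reports one subinterval, the variables selected there, and the p-value of the test restricted to those variables. Since one test is run per subinterval, a multiple-testing correction would be needed before reading the p-values jointly.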
Taxonomy
Topics: Time Series Analysis and Forecasting
