Querying Labeled Time Series Data with Scenario Programs
Devan Shanker

TL;DR
This paper introduces a formal method and algorithm for matching real-world labeled time series data with simulated scenarios to validate autonomous vehicle safety and ensure simulation failures align with real-world outcomes.
Contribution
It provides a novel formal definition of scenario-data matching and develops a scalable querying algorithm applicable to autonomous vehicles and other cyber-physical systems.
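As a rough illustration of what matching a scenario against labeled time series data can mean, the sketch below treats a scenario as an ordered list of predicates over labeled frames and checks whether a recording satisfies them in order. This is a toy under stated assumptions, not the paper's formal definition or its querying algorithm: the frame layout, the `gap` helper, and the `match_phases` function are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical representation: each frame maps an object label to an (x, y) position.
Frame = Dict[str, Tuple[float, float]]
Predicate = Callable[[Frame], bool]


def match_phases(frames: List[Frame], phases: List[Predicate]) -> Optional[List[int]]:
    """Return the earliest frame index satisfying each phase, in order.

    Returns None when the recording does not contain the phases in sequence.
    The greedy earliest-hit search is enough to decide an ordered subsequence match.
    """
    indices: List[int] = []
    start = 0
    for phase in phases:
        hit = next((i for i in range(start, len(frames)) if phase(frames[i])), None)
        if hit is None:
            return None
        indices.append(hit)
        start = hit + 1  # the next phase must occur strictly later
    return indices


def gap(frame: Frame) -> float:
    """Longitudinal distance from the ego vehicle to the lead vehicle (assumed labels)."""
    return frame["lead"][0] - frame["ego"][0]


# Toy scenario: the lead vehicle starts far ahead, then the gap closes sharply.
phases: List[Predicate] = [lambda f: gap(f) > 20, lambda f: gap(f) < 5]

# Toy labeled recording with three frames.
frames: List[Frame] = [
    {"ego": (0.0, 0.0), "lead": (30.0, 0.0)},
    {"ego": (5.0, 0.0), "lead": (20.0, 0.0)},
    {"ego": (10.0, 0.0), "lead": (13.0, 0.0)},
]

result = match_phases(frames, phases)
```

Here `result` holds the frame indices at which each phase is first satisfied; running the same query on the reversed recording finds no match, because the phases would occur in the wrong order.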
Findings
High precision in matching challenging time series scenarios
Algorithm demonstrates scalability on large datasets
Applicable to diverse cyber-physical systems
Abstract
To ensure autonomous vehicles (AVs) are safe for on-road deployment, simulation-based testing has become an integral complement to on-road testing. The rise in simulation testing and validation reflects a growing need to verify that AV behavior is consistent with desired outcomes even in edge-case scenarios that may seldom or never appear in on-road testing data. This raises a critical question: to what extent are AV failures in simulation consistent with data collected from real-world testing? As a result of the gap between simulated and real sensor data (the sim-to-real gap), failures in simulation can be either spurious (simulation- or simulator-specific issues) or relevant (safety-critical AV system issues). One possible method for validating whether simulated time series failures are consistent with real-world time series sensor data could involve retrieving instances of the failure…
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Data Mining Algorithms and Applications
Methods: Sparse Evolutionary Training
