AegisTS: A Hierarchical Agent System with Reinforcement Learning for Multivariate Time Series Data Cleaning
Yuhan Shi, Yuanyuan Yao, Lu Chen, Mourad Khayati, Tianyi Li

TL;DR
AegisTS is a hierarchical reinforcement learning agent system that optimizes multivariate time series data cleaning by jointly addressing multiple quality issues without needing ground truth.
Contribution
Introduces a hierarchical agent architecture with a dual-stage reward mechanism that cleans multiple co-occurring quality issues in MTS without ground truth.
Findings
Achieves up to 96% improvement in data cleaning quality.
Attains 27% better downstream performance.
Outperforms existing data cleaning methods.
Abstract
Multivariate time series (MTS) are frequently affected by co-occurring quality issues, such as missing values, outliers, and constraint violations, which significantly undermine downstream analytics. Existing cleaning approaches fix only a limited set of such issues, making them ill-suited for scenarios where multiple quality problems arise simultaneously. Furthermore, these methods commonly depend on the availability of ground truth data or domain-specific rules, both of which are rarely accessible in real-world applications. In this paper, we introduce AegisTS, an agent system with reinforcement learning designed to clean multiple data quality issues in MTS. We cast the cleaning process as a joint optimization problem that simultaneously handles quality issue order and cleaning model selection, allowing efficient navigation of the large space of possible cleaning pipelines. Our…
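To give a sense of why the abstract calls the space of cleaning pipelines large, the sketch below enumerates every combination of issue order and per-issue model choice for a toy setup. It is only an illustration of the search space, not the paper's method: the three issue types come from the abstract, but the candidate cleaner names, the `CANDIDATES` table, and both helper functions are hypothetical, and the code performs no reinforcement learning and no actual cleaning.

```python
from itertools import permutations, product
from math import factorial

# Hypothetical candidate cleaners per issue type. The issue types are the ones
# named in the abstract; the model names are placeholders, not from the paper.
CANDIDATES = {
    "missing_values": ["linear_interp", "mean_fill", "knn_impute"],
    "outliers": ["zscore_clip", "iqr_clip"],
    "constraint_violations": ["clamp_to_range", "speed_repair"],
}


def enumerate_pipelines(candidates):
    """Yield each pipeline as an ordered tuple of (issue, model) steps."""
    issues = list(candidates)
    for order in permutations(issues):
        # For a fixed issue order, pick one candidate model per issue.
        for models in product(*(candidates[issue] for issue in order)):
            yield tuple(zip(order, models))


def count_pipelines(candidates):
    """Closed-form size of the space: k! orderings times the model choices."""
    total = factorial(len(candidates))
    for models in candidates.values():
        total *= len(models)
    return total


pipelines = list(enumerate_pipelines(CANDIDATES))
print(len(pipelines), count_pipelines(CANDIDATES))
```

With k issue types and m_i candidate models for issue i, the space holds k! × m_1 × … × m_k pipelines, so it grows factorially in the number of issue types. That growth makes exhaustive evaluation impractical beyond toy sizes, which is consistent with the abstract's motivation for a learned search.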
