DRTA: Dynamic Reward Scaling for Reinforcement Learning in Time Series Anomaly Detection
Bahareh Golchin, Banafsheh Rekabdar, Kunpeng Liu

TL;DR
This paper introduces DRTA, a reinforcement learning framework that combines dynamic reward scaling, a Variational Autoencoder (VAE), and active learning to improve time series anomaly detection, especially when labeled data is scarce.
Contribution
The paper presents a novel RL-based approach with adaptive reward shaping that enhances anomaly detection accuracy and generalization in limited-label environments.
Findings
Outperforms state-of-the-art methods on the Yahoo A1 and Yahoo A2 benchmarks
Effective in low-label anomaly detection scenarios
Maintains high precision and recall
Abstract
Anomaly detection in time series data is important for applications in finance, healthcare, sensor networks, and industrial monitoring. Traditional methods usually struggle with limited labeled data, high false-positive rates, and difficulty generalizing to novel anomaly types. To overcome these challenges, we propose a reinforcement learning-based framework that integrates dynamic reward shaping, Variational Autoencoder (VAE), and active learning, called DRTA. Our method uses an adaptive reward mechanism that balances exploration and exploitation by dynamically scaling the effect of VAE-based reconstruction error and classification rewards. This approach enables the agent to detect anomalies effectively in low-label systems while maintaining high precision and recall. Our experimental results on the Yahoo A1 and Yahoo A2 benchmark datasets demonstrate that the proposed method…
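The abstract describes a reward that mixes a classification signal with a VAE reconstruction error under a dynamically adjusted scale. The snippet below is only a minimal sketch of that general idea, not the authors' formulation: the reward values, the additive form `classification + lam * recon_error`, the proportional update rule, and every name (`classification_reward`, `total_reward`, `update_scale`, `lam`, `target_reward`) are assumptions made for illustration.

```python
# Hypothetical sketch of a dynamically scaled reward for an RL anomaly detector.
# All constants and the update rule are illustrative assumptions, not taken
# from the paper.

def classification_reward(action, label, tp=5.0, tn=1.0, fp=-1.0, fn=-5.0):
    """Extrinsic reward: action/label are 1 for anomaly, 0 for normal."""
    if label == 1:
        return tp if action == 1 else fn
    return tn if action == 0 else fp


def total_reward(action, label, recon_error, lam):
    """Combine the classification reward with a scaled VAE reconstruction error."""
    return classification_reward(action, label) + lam * recon_error


def update_scale(lam, episode_reward, target_reward,
                 step=0.1, lam_min=0.0, lam_max=10.0):
    """Assumed proportional update: raise lam when the episode reward falls
    short of the target, lower it when the target is exceeded, then clamp."""
    lam_new = lam + step * (target_reward - episode_reward)
    return min(lam_max, max(lam_min, lam_new))


if __name__ == "__main__":
    lam = 1.0
    # A correctly flagged anomaly with reconstruction error 0.5.
    print(total_reward(action=1, label=1, recon_error=0.5, lam=lam))
    # Episode reward below target, so the scale is increased.
    lam = update_scale(lam, episode_reward=0.0, target_reward=5.0)
    print(lam)
```

Under these assumptions, a larger `lam` lets the reconstruction error dominate the reward when the classification reward is uninformative, and a smaller `lam` hands control back to the labeled signal as the agent improves.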
