R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous Robotics
Zexin Li, Aritra Samanta, Yufei Li, Andrea Soltoggio, Hyoseung Kim, and Cong Liu

TL;DR
This paper introduces R^3, a comprehensive system for managing timing, memory, and performance trade-offs in on-device real-time deep reinforcement learning for autonomous robots, ensuring efficient adaptation in dynamic environments.
Contribution
R^3 is a holistic solution that dynamically balances timing, memory, and algorithm performance in on-device DRL training through feedback loops, memory management, and runtime coordination.
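To make the idea of a timing feedback loop concrete, here is a minimal sketch of one way a training loop could adapt its batch size to a per-iteration deadline. It is an illustration only, not the mechanism R^3 uses: the function name `adjust_batch_size`, its parameters, the proportional scaling rule, and the default bounds are all assumptions introduced here.

```python
def adjust_batch_size(batch_size, measured_latency_ms, deadline_ms,
                      min_batch=8, max_batch=512):
    """Return a new batch size scaled by deadline / measured latency.

    Hypothetical helper: assumes training latency grows roughly linearly
    with batch size, which is a simplification and not a claim from the paper.
    """
    if measured_latency_ms <= 0:
        # No usable measurement yet; keep the current setting.
        return batch_size
    scaled = int(batch_size * deadline_ms / measured_latency_ms)
    # Clamp to the allowed range so the controller never runs away.
    return max(min_batch, min(max_batch, scaled))


# One step of the loop: the last iteration took twice the deadline,
# so the next batch is halved.
next_batch = adjust_batch_size(128, measured_latency_ms=200, deadline_ms=100)
```

A proportional rule like this reacts within a single step but can oscillate when latency measurements are noisy; smoothing the measurement over several iterations is a common remedy.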
Findings
R^3 maintains consistent latency and timing predictability across hardware platforms.
It reduces memory footprint, enabling larger replay buffers and improved learning.
R^3 demonstrates effectiveness in autonomous vehicle simulation environments.
Abstract
Autonomous robotic systems, like autonomous vehicles and robotic search and rescue, require efficient on-device training for continuous adaptation of Deep Reinforcement Learning (DRL) models in dynamic environments. This research is fundamentally motivated by the need to understand and address the challenges of on-device real-time DRL, which involves balancing timing and algorithm performance under memory constraints, as exposed through our extensive empirical studies. This intricate balance requires co-optimizing two pivotal parameters of DRL training -- batch size and replay buffer size. Configuring these parameters significantly affects timing and algorithm performance, while both (unfortunately) require substantial memory allocation to achieve near-optimal performance. This paper presents R^3, a holistic solution for managing timing, memory, and algorithm performance in on-device…
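The tension between batch size and replay buffer size under a fixed memory budget can be illustrated with simple arithmetic. The sketch below assumes a linear memory model (training memory proportional to batch size, buffer memory proportional to the number of stored transitions); the function `max_replay_capacity` and all of its parameters are hypothetical and are not taken from the paper.

```python
def max_replay_capacity(memory_budget_bytes, batch_size,
                        train_bytes_per_sample, bytes_per_transition,
                        fixed_overhead_bytes=0):
    """Largest replay buffer (in transitions) that fits next to training.

    Assumed model: total = fixed overhead
                         + batch_size * train_bytes_per_sample
                         + capacity * bytes_per_transition
    """
    training_bytes = batch_size * train_bytes_per_sample
    remaining = memory_budget_bytes - fixed_overhead_bytes - training_bytes
    if remaining <= 0:
        # The batch alone exhausts the budget; no room for a buffer.
        return 0
    return remaining // bytes_per_transition


# Toy numbers: doubling the batch size shrinks the buffer that still fits.
small_batch = max_replay_capacity(1000, batch_size=10,
                                  train_bytes_per_sample=20,
                                  bytes_per_transition=8)
large_batch = max_replay_capacity(1000, batch_size=20,
                                  train_bytes_per_sample=20,
                                  bytes_per_transition=8)
```

Under this model every increase in batch size directly reduces the replay capacity that fits in the same budget, which is why the two parameters have to be chosen together.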
Taxonomy
Topics: Real-Time Systems Scheduling · Age of Information Optimization · Distributed systems and fault tolerance
