Differentially Encoded Observation Spaces for Perceptive Reinforcement Learning
Lev Grossman, Brian Plancher

TL;DR
This paper introduces differentially encoded observation spaces for perceptive deep reinforcement learning (DRL). Compressing the image-based replay buffer reduces memory usage and latency, which makes training and deployment on edge devices more efficient.
Contribution
It proposes a novel lossless differential video encoding scheme for compressing image-based replay buffers in DRL, improving efficiency without sacrificing performance.
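As a rough illustration of the idea, the sketch below treats a sequence of stored observations as a video: the first frame is kept whole and each later frame is stored as its difference from the previous one. This is not the authors' encoder. The XOR delta, the use of `zlib` for the lossless compression step, and the names `encode_frames` and `decode_frames` are all assumptions made for illustration.

```python
import zlib


def encode_frames(frames):
    """Encode equal-sized frames (bytes objects) as a keyframe plus XOR deltas.

    Hypothetical helper: the first frame is compressed as-is, and every later
    frame is XORed with its predecessor before compression, so unchanged
    pixels become zero bytes that compress well.
    """
    if not frames:
        return []
    encoded = [zlib.compress(frames[0])]  # keyframe, stored whole
    for prev, cur in zip(frames, frames[1:]):
        if len(prev) != len(cur):
            raise ValueError("all frames must have the same size")
        delta = bytes(a ^ b for a, b in zip(prev, cur))
        encoded.append(zlib.compress(delta))
    return encoded


def decode_frames(encoded):
    """Invert encode_frames exactly; XOR with the previous frame is lossless."""
    frames = []
    prev = None
    for blob in encoded:
        raw = zlib.decompress(blob)
        cur = raw if prev is None else bytes(a ^ b for a, b in zip(prev, raw))
        frames.append(cur)
        prev = cur
    return frames


if __name__ == "__main__":
    # Synthetic 84x84 grayscale frames in which a single pixel moves each step.
    size = 84 * 84
    frames = []
    for t in range(4):
        frame = bytearray(size)
        frame[t] = 255
        frames.append(bytes(frame))

    encoded = encode_frames(frames)
    assert decode_frames(encoded) == frames  # round trip is exact
    print(sum(len(f) for f in frames), "raw bytes")
    print(sum(len(b) for b in encoded), "encoded bytes")
```

The achievable ratio depends heavily on how much consecutive observations overlap, so figures from this toy example say nothing about the savings reported for the paper's own scheme.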
Findings
Memory footprint reduced by up to 14.2x on Atari tasks and up to 16.7x on DeepMind Control Suite tasks.
Latency improved by up to 32% on DeepMind Control Suite tasks.
Effective across Atari and DeepMind Control Suite tasks.
Abstract
Perceptive deep reinforcement learning (DRL) has led to many recent breakthroughs for complex AI systems leveraging image-based input data. Applications of these results range from super-human level video game agents to dexterous, physically intelligent robots. However, training these perceptive DRL-enabled systems remains incredibly compute and memory intensive, often requiring huge training datasets and large experience replay buffers. This poses a challenge for the next generation of field robots that will need to be able to learn on the edge in order to adapt to their environments. In this paper, we begin to address this issue through differentially encoded observation spaces. By reinterpreting stored image-based observations as a video, we leverage lossless differential video encoding schemes to compress the replay buffer without impacting training performance. We evaluate our…
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Neural Networks and Reservoir Computing
Methods: Experience Replay
