Squeezing More from the Stream: Learning Representation Online for Streaming Reinforcement Learning
Nilaksh, Antoine Clavaud, Mathieu Reymond, François Rivest, Sarath Chandar

TL;DR
This paper introduces a novel method for streaming reinforcement learning that enhances representation learning from limited, transient data by extending self-predictive representations and addressing training instabilities, leading to improved performance and richer representations.
Contribution
It extends Self-Predictive Representations to streaming RL, introduces orthogonal gradient updates to stabilize training, and demonstrates improved performance and richer representations without replay buffers.
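As a rough illustration of what an "orthogonal gradient update" can mean in general, the sketch below removes from a gradient vector its component along a reference direction. This summary does not say which vectors the paper projects or how the reference is derived from the momentum target, so `project_orthogonal`, `grad`, and `reference` are hypothetical names chosen for illustration, not the authors' implementation.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_orthogonal(grad, reference, eps=1e-12):
    """Return grad minus its component along reference.

    Assumption: `reference` stands in for whatever direction the method
    wants the update to be orthogonal to; the summary does not define it.
    """
    ref_sq = dot(reference, reference)
    if ref_sq < eps:
        # A (near-)zero reference defines no direction; leave grad unchanged.
        return list(grad)
    scale = dot(grad, reference) / ref_sq
    return [g - scale * r for g, r in zip(grad, reference)]


grad = [3.0, 4.0]
reference = [1.0, 0.0]
projected = project_orthogonal(grad, reference)
# The projected vector has no component left along `reference`.
```

After the projection, the inner product of `projected` and `reference` is zero, so a step along `projected` does not move the parameters along the reference direction to first order.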
Findings
Outperforms existing streaming RL baselines on multiple benchmarks.
Learns significantly richer and more meaningful representations.
Remains computationally efficient, training on a few CPU cores.
Abstract
In streaming Reinforcement Learning (RL), transitions are observed and discarded immediately after a single update. While this minimizes resource usage for on-device applications, it makes agents notoriously sample-inefficient, since value-based losses alone struggle to extract meaningful representations from transient data. We propose extending Self-Predictive Representations (SPR) to the streaming pipeline to maximize the utility of every observed frame. However, due to the highly correlated samples induced by the streaming regime, naively applying this auxiliary loss results in training instabilities. Thus, we introduce orthogonal gradient updates relative to the momentum target and resolve gradient conflicts arising from streaming-specific optimizers. Validated across the Atari, MinAtar, and Octax suites, our approach systematically outperforms existing streaming baselines.…
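For readers unfamiliar with SPR, the auxiliary objective in the original (replay-based) formulation is a negative cosine similarity between a predicted future latent and the latent produced by a momentum (exponential moving average) copy of the encoder. The sketch below shows only those two ingredients on plain Python lists. The function names and the `tau` value are illustrative assumptions, and nothing here reflects the streaming-specific changes the abstract describes.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity of two equal-length vectors, guarded against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (max(norm_u, eps) * max(norm_v, eps))


def spr_loss(predicted_latent, target_latent):
    """Negative cosine similarity: minimized when the prediction aligns with the target."""
    return -cosine_similarity(predicted_latent, target_latent)


def ema_update(target_params, online_params, tau=0.99):
    """Move the momentum target a small step toward the online parameters.

    `tau` is a hypothetical momentum coefficient, not a value from the paper.
    """
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]


aligned = spr_loss([1.0, 0.0], [2.0, 0.0])      # same direction
orthogonal = spr_loss([1.0, 0.0], [0.0, 1.0])   # unrelated direction
new_target = ema_update([0.0], [1.0], tau=0.9)
```

In this toy example the loss reaches its minimum of -1 when the prediction and target point the same way, and the target parameters move only a tenth of the way toward the online parameters per update.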
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Adversarial Robustness in Machine Learning
