Fast Retinomorphic Event Stream for Video Recognition and Reinforcement Learning
Wanjia Liu, Huaijin Chen, Rishab Goel, Yuzhong Huang, Ashok Veeraraghavan, Ankit Patel

TL;DR
This paper introduces a fast, biologically inspired event-driven representation (EDR) for video processing that enables real-time inference and learning, significantly improving speed over optical flow-based methods while maintaining high accuracy.
Contribution
The paper proposes a novel EDR model inspired by early retinal circuits, enabling fast, real-time video analysis suitable for reinforcement learning and recognition tasks.
Findings
EDR achieves over 9000 fps for real-time inference.
Demonstrates improved RL performance on Atari games.
Maintains near state-of-the-art accuracy on UCF-101 while running 1500x faster than optical flow-based methods.
Abstract
Good temporal representations are crucial for video understanding, and the state-of-the-art video recognition framework is based on two-stream networks. In such a framework, besides the regular ConvNets responsible for RGB frame inputs, a second network is introduced to handle the temporal representation, usually the optical flow (OF). However, OF or other task-oriented flow is computationally costly, and is thus typically pre-computed. Critically, this prevents the two-stream approach from being applied to reinforcement learning (RL) applications such as video game playing, where the next state depends on the current state and action choices. Inspired by the early vision systems of mammals and insects, we propose a fast event-driven representation (EDR) that models several major properties of early retinal circuits: (1) logarithmic input response, (2) multi-timescale temporal smoothing to…
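The first two properties named in the abstract, logarithmic input response and multi-timescale temporal smoothing, can be illustrated with a minimal sketch. This is not the authors' implementation: the exponential moving average, the function names `log_response` and `smooth_multi_timescale`, the `eps` offset, and the `alphas` values are all assumptions chosen for illustration, and frames are flat lists of pixel intensities so that the example has no dependencies.

```python
import math


def log_response(frame, eps=1.0):
    """Compress pixel intensities logarithmically: log(eps + x).

    `eps` is an assumed offset that keeps the log defined at x = 0.
    """
    return [math.log(eps + px) for px in frame]


def smooth_multi_timescale(frames, alphas=(0.5, 0.1)):
    """Smooth log-compressed frames at several timescales.

    Assumption: each timescale is modeled as an exponential moving
    average with its own rate `alpha` (larger alpha = faster timescale).
    Returns one smoothed frame per alpha, after all frames are consumed.
    """
    states = [None] * len(alphas)
    for frame in frames:
        logged = log_response(frame)
        for i, alpha in enumerate(alphas):
            if states[i] is None:
                # Initialize each filter with the first observed frame.
                states[i] = list(logged)
            else:
                states[i] = [
                    (1.0 - alpha) * s + alpha * x
                    for s, x in zip(states[i], logged)
                ]
    return states


if __name__ == "__main__":
    # A single pixel that steps from dark to bright.
    frames = [[0.0], [math.e - 1.0]]
    fast, slow = smooth_multi_timescale(frames)
    print(fast, slow)
```

In this sketch the fast filter follows the brightness step more closely than the slow one, so a difference between the two states could serve as a crude change signal. How the paper actually combines its timescales is not stated in the text above.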
Taxonomy
Topics: Zebrafish Biomedical Research Applications · Advanced Neural Network Applications · Advanced Memory and Neural Computing
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
