evMLP: An Efficient Event-Driven MLP Architecture for Vision
Zhentan Zheng

TL;DR
evMLP introduces an event-driven MLP architecture for vision tasks that selectively updates image patches based on changes between frames, enhancing computational efficiency while maintaining competitive accuracy.
Contribution
The paper proposes evMLP with an event-driven local update mechanism, enabling efficient processing of sequential visual data by focusing on changed regions.
Findings
Achieves competitive ImageNet classification accuracy.
Reduces computational cost on video datasets.
Maintains output consistency with non-event-driven models.
Abstract
Deep neural networks have achieved remarkable results in computer vision tasks. In the early days, Convolutional Neural Networks (CNNs) were the mainstream architecture. In recent years, Vision Transformers (ViTs) have become increasingly popular. In addition, exploring applications of multi-layer perceptrons (MLPs) has provided new perspectives for research into vision model architectures. In this paper, we present evMLP accompanied by a simple event-driven local update mechanism. The proposed evMLP can independently process patches on images or feature maps via MLPs. We define changes between consecutive frames as "events". Under the event-driven local update mechanism, evMLP selectively processes patches where events occur. For sequential image data (e.g., video processing), this approach improves computational performance by avoiding redundant computations. Through ImageNet image…
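The event-driven local update described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the class name `EventDrivenPatchMLP`, the helpers `make_mlp` and `mlp_forward`, the two-layer ReLU MLP, the max-absolute-difference event test, and the `threshold` parameter are all assumptions made for illustration. The sketch covers only the patch-independent case, where each patch's output depends on that patch alone, so cached outputs of unchanged patches can be reused.

```python
import random

def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Build random weights for a tiny two-layer MLP (hypothetical helper)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2

def mlp_forward(mlp, patch):
    """Apply the MLP to one flattened patch: linear, ReLU, linear."""
    w1, w2 = mlp
    hidden = [max(0.0, sum(w * x for w, x in zip(row, patch))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

class EventDrivenPatchMLP:
    """Caches per-patch outputs and recomputes only patches with an event.

    Assumption: an "event" is a patch whose largest absolute element-wise
    difference from its last processed version exceeds `threshold`.
    """

    def __init__(self, mlp, threshold=0.0):
        self.mlp = mlp
        self.threshold = threshold
        self.prev_patches = None    # last processed version of each patch
        self.cached_outputs = None  # MLP output for each patch
        self.num_updates = 0        # patches processed on the last call

    def __call__(self, patches):
        if self.prev_patches is None:
            # First frame: no cache exists yet, so process every patch.
            self.cached_outputs = [mlp_forward(self.mlp, p) for p in patches]
            self.prev_patches = [list(p) for p in patches]
            self.num_updates = len(patches)
            return [list(o) for o in self.cached_outputs]
        self.num_updates = 0
        for i, (new, old) in enumerate(zip(patches, self.prev_patches)):
            diff = max(abs(a - b) for a, b in zip(new, old))
            if diff > self.threshold:
                # Event detected: recompute this patch and refresh the cache.
                self.cached_outputs[i] = mlp_forward(self.mlp, new)
                self.prev_patches[i] = list(new)
                self.num_updates += 1
        return [list(o) for o in self.cached_outputs]

# Usage: two consecutive "frames" of three flattened 4-element patches.
mlp = make_mlp(in_dim=4, hidden_dim=8, out_dim=3)
model = EventDrivenPatchMLP(mlp)
frame0 = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
frame1 = [list(p) for p in frame0]
frame1[1][2] = 0.0                # only the middle patch changes
out0 = model(frame0)              # processes all three patches
out1 = model(frame1)              # processes only the changed patch
full = [mlp_forward(mlp, p) for p in frame1]  # non-event-driven reference
```

With `threshold=0.0`, every changed patch is recomputed, so `out1` matches the full recomputation in `full` while only one of the three patches is processed on the second frame. A positive threshold would skip more work at the cost of an approximate output. Because the stored reference is the last processed version of each patch, small changes below the threshold cannot accumulate unnoticed across frames.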
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Memory and Neural Computing
