Don't Think It Twice: Exploit Shift Invariance for Efficient Online Streaming Inference of CNNs
Christodoulos Kechris, Jonathan Dan, Jose Miranda, David Atienza

TL;DR
This paper introduces StreamiNNC, a method that exploits shift-invariance in CNNs to significantly reduce computational costs during online streaming inference, especially in biomedical signal processing, by skipping redundant calculations.
Contribution
It proposes a novel strategy for efficient streaming CNN inference that accounts for zero-padding and pooling effects, with theoretical error bounds and practical validation.
Findings
StreamiNNC reduces inference computation by exploiting shift-invariance.
Its streaming output deviates from normal (non-streaming) inference by 2.03-3.55% NRMSE.
Validation on biomedical data demonstrates practical efficiency gains.
Abstract
Deep learning time-series processing often relies on convolutional neural networks with overlapping windows. This overlap allows the network to produce an output faster than the window length. However, it introduces additional computations. This work explores the potential to optimize computational efficiency during inference by exploiting convolution's shift-invariance properties to skip the calculation of layer activations between successive overlapping windows. Although convolutions are shift-invariant, zero-padding and pooling operations, widely used in such networks, are not efficient and complicate efficient streaming inference. We introduce StreamiNNC, a strategy to deploy Convolutional Neural Networks for online streaming inference. We explore the adverse effects of zero padding and pooling on the accuracy of streaming inference, deriving theoretical error upper bounds for…
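The reuse idea described in the abstract can be illustrated with a minimal NumPy sketch of a single 1-D convolution layer. This is not the StreamiNNC implementation: the helper names (`conv1d_valid`, `conv1d_same_zero`) and the sizes (`window`, `step`, a 5-tap `kernel`) are assumptions chosen for illustration. It shows that, without padding, the activations of a new overlapping window equal shifted activations of the previous window plus `step` newly computed values, and that zero padding breaks this equality near the window edges.

```python
import numpy as np

def conv1d_valid(x, w):
    """Cross-correlation without padding; output length is len(x) - len(w) + 1."""
    return np.correlate(x, w, mode="valid")

def conv1d_same_zero(x, w):
    """Same-length output via zero padding; assumes an odd kernel length."""
    pad = len(w) // 2
    return np.correlate(np.pad(x, pad), w, mode="valid")

rng = np.random.default_rng(0)
signal = rng.standard_normal(64)   # hypothetical input stream
kernel = rng.standard_normal(5)    # hypothetical 5-tap filter
window, step = 32, 8               # hypothetical window length and hop size
k = len(kernel)

# Normal inference: process each overlapping window from scratch.
prev = conv1d_valid(signal[0:window], kernel)
curr = conv1d_valid(signal[step:step + window], kernel)

# Streaming inference: reuse the shifted previous activations and compute
# only the `step` new outputs, which need the last step + k - 1 input samples.
tail_input = signal[window - (k - 1):step + window]
new_outputs = conv1d_valid(tail_input, kernel)
streamed = np.concatenate([prev[step:], new_outputs])
max_err_valid = np.max(np.abs(streamed - curr))

# With zero padding, the shifted activations no longer match at the edges,
# because the padded zeros sit at different positions in each window.
prev_z = conv1d_same_zero(signal[0:window], kernel)
curr_z = conv1d_same_zero(signal[step:step + window], kernel)
mismatch = np.abs(curr_z[:window - step] - prev_z[step:])
pad = k // 2

print("max error without padding:", max_err_valid)
print("mismatch at overlap edges with zero padding:", mismatch[:pad], mismatch[-pad:])
print("max mismatch in overlap interior:", np.max(mismatch[pad:-pad]))
```

In this single-layer setting, the unpadded streaming result matches normal inference up to floating-point rounding while computing only `step` outputs per new window instead of `window - k + 1`. The zero-padded case differs only in the `pad` positions at each end of the overlap, which is the kind of edge effect the abstract says complicates streaming inference.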
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Data Compression Techniques · Image and Signal Denoising Methods
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
