BlockCopy: High-Resolution Video Processing with Block-Sparse Feature Propagation and Online Policies
Thomas Verelst, Tinne Tuytelaars

TL;DR
BlockCopy accelerates video processing by running computations only on the important regions of each frame and copying features from the preceding frame for the remaining regions, reducing FLOPs and latency while maintaining accuracy.
Contribution
The paper presents a universal framework for block-sparse feature propagation in pretrained video CNNs. Its execution policy is trained online with reinforcement learning, without ground-truth annotations, and improves speed with minimal accuracy loss.
Findings
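Significant reduction in FLOPs and latency compared to frame-by-frame processing.
Effective on multiple dense prediction tasks: pedestrian detection, instance segmentation, and semantic segmentation.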
Minimal accuracy loss with the proposed method.
Abstract
In this paper, we propose BlockCopy, a scheme that accelerates pretrained frame-based CNNs to process video more efficiently than standard frame-by-frame processing. To this end, a lightweight policy network determines important regions in an image, and operations are applied on selected regions only, using custom block-sparse convolutions. Features of non-selected regions are simply copied from the preceding frame, reducing the number of computations and latency. The execution policy is trained using reinforcement learning in an online fashion without requiring ground-truth annotations. Our universal framework is demonstrated on dense prediction tasks such as pedestrian detection, instance segmentation and semantic segmentation, using both state-of-the-art (Center and Scale Predictor, MGAN, SwiftNet) and standard baseline networks (Mask-RCNN, DeepLabV3+). BlockCopy achieves…
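The copy-or-recompute step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the paper uses a learned policy network and custom block-sparse convolutions, whereas the hypothetical `frame_difference_policy` below is a simple frame-difference heuristic standing in for the policy, and `compute_fn` is a placeholder for any per-block operation. All function and parameter names are assumptions made for illustration.

```python
import numpy as np


def frame_difference_policy(prev_frame, frame, block_size, threshold):
    """Hypothetical stand-in for the learned policy network.

    Marks a block for execution when its mean absolute pixel change exceeds
    `threshold`. Assumes 2D frames whose height and width are multiples of
    `block_size`.
    """
    h, w = frame.shape
    diff = np.abs(frame - prev_frame)
    # Split the frame into a grid of (block_size x block_size) blocks.
    blocks = diff.reshape(h // block_size, block_size, w // block_size, block_size)
    return blocks.mean(axis=(1, 3)) > threshold


def block_copy_update(prev_features, frame, block_mask, block_size, compute_fn):
    """Recompute features in selected blocks; copy all others from the previous frame.

    `compute_fn` is an illustrative per-block operation that maps a
    (block_size x block_size) input patch to a patch of the same shape.
    """
    out = prev_features.copy()  # non-selected blocks keep the old features
    rows, cols = block_mask.shape
    for by in range(rows):
        for bx in range(cols):
            if block_mask[by, bx]:
                ys = slice(by * block_size, (by + 1) * block_size)
                xs = slice(bx * block_size, (bx + 1) * block_size)
                out[ys, xs] = compute_fn(frame[ys, xs])
    return out


# Usage: only the top-left block of the frame changes, so only it is recomputed.
prev_frame = np.zeros((4, 4))
frame = prev_frame.copy()
frame[:2, :2] = 1.0
mask = frame_difference_policy(prev_frame, frame, block_size=2, threshold=0.5)
features = block_copy_update(np.full((4, 4), -1.0), frame, mask, 2, lambda patch: patch * 2)
```

In this sketch the cost of `compute_fn` scales with the number of selected blocks rather than with the frame size, which is the source of the savings the abstract describes. A real convolution would also need the neighboring pixels around each block, which this pointwise placeholder ignores.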
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition
