Fast Semantic Segmentation on Video Using Block Motion-Based Feature Interpolation
Samvit Jain, Joseph E. Gonzalez

TL;DR
This paper introduces a fast method for semantic segmentation on video. It propagates features from frame to frame using the block motion vectors already present in compressed video and interpolates features between keyframes, speeding up inference while maintaining competitive accuracy.
Contribution
The authors propose a novel two-part approach combining block motion-based feature propagation and feature interpolation for accelerated video segmentation.
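To make the propagation half of this approach concrete, here is a minimal NumPy sketch of warping a keyframe's feature map with per-block motion vectors. It is an illustration, not the authors' implementation. The function name `propagate_features`, the `(dy, dx)` vector layout, the integer-valued vectors already scaled to feature-map resolution, and the nearest-neighbor lookup are all assumptions.

```python
import numpy as np

def propagate_features(features, motion_vectors, block_size):
    """Warp a keyframe feature map to a later frame using block motion vectors.

    features:       (C, H, W) feature map computed on the keyframe.
    motion_vectors: (H // block_size, W // block_size, 2) integer (dy, dx)
                    offsets (assumed layout), pointing from each block of the
                    current frame to its source location in the keyframe.
    block_size:     side length of a block in feature-map pixels; H and W are
                    assumed to be divisible by it.
    """
    C, H, W = features.shape
    mv = np.asarray(motion_vectors, dtype=int)
    # Expand each block's vector to every pixel inside that block.
    mv = np.repeat(np.repeat(mv, block_size, axis=0), block_size, axis=1)
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Source coordinates in the keyframe, clamped to the feature-map borders.
    src_y = np.clip(ys + mv[..., 0], 0, H - 1)
    src_x = np.clip(xs + mv[..., 1], 0, W - 1)
    # Gather: every channel is sampled at the same source location.
    return features[:, src_y, src_x]
```

Because the warp is a single gather driven by vectors the codec has already computed, it avoids running a separate optical-flow network, which is the source of the cost saving the paper describes.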
Findings
Achieves near real-time segmentation at 20.1 fps on large images.
Provides nearly 6x speedup over single-frame baseline.
Maintains competitive accuracy with significantly faster inference.
Abstract
Convolutional networks optimized for accuracy on challenging, dense prediction tasks are prohibitively slow to run on each frame in a video. The spatial similarity of nearby video frames, however, suggests an opportunity to reuse computation. Existing work has explored basic feature reuse and feature warping based on optical flow, but has encountered limits to the speedup attainable with these techniques. In this paper, we present a new, two-part approach to accelerating inference on video. First, we propose a fast feature propagation technique that utilizes the block motion vectors present in compressed video (e.g., H.264 codecs) to cheaply propagate features from frame to frame. Second, we develop a novel feature estimation scheme, termed feature interpolation, that fuses features propagated from enclosing keyframes to render accurate feature estimates, even at sparse keyframe…
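The feature interpolation step described in the abstract, which fuses features propagated from the two enclosing keyframes, might be sketched as below. The abstract does not specify the fusion operator, so this example assumes a simple distance-weighted average. The function name `interpolate_features` and its parameters are hypothetical.

```python
import numpy as np

def interpolate_features(forward_feat, backward_feat, dist_prev, dist_next):
    """Fuse features propagated from the two keyframes enclosing a frame.

    forward_feat:  features propagated forward from the previous keyframe.
    backward_feat: features propagated backward from the next keyframe.
    dist_prev:     number of frames between the previous keyframe and this frame.
    dist_next:     number of frames between this frame and the next keyframe.

    Assumption: a distance-weighted average, so the nearer keyframe
    contributes more. The paper's actual fusion operator may differ.
    """
    total = dist_prev + dist_next
    if total <= 0:
        raise ValueError("keyframe distances must sum to a positive value")
    w_prev = dist_next / total  # closer to the previous keyframe -> larger weight
    w_next = dist_prev / total
    return w_prev * forward_feat + w_next * backward_feat
```

For a frame one step after the previous keyframe and three steps before the next, the forward-propagated features receive weight 0.75 and the backward-propagated features 0.25.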
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Advanced Neural Network Applications
