TL;DR
This paper introduces clockwork convnets, a video recognition framework that exploits the slow evolution of semantic content across frames to reduce the computation needed for per-frame segmentation.
Contribution
The paper defines a family of clockwork convnets in which fixed or adaptive clock signals schedule different layers at different update rates according to their semantic stability, enabling faster and more efficient video segmentation.
Findings
A pipeline schedule reduces latency for real-time recognition.
A fixed-rate schedule reduces overall computation while maintaining accuracy.
Effective on multiple video datasets.
Abstract
Recent years have seen tremendous progress in still-image segmentation; however, the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video. We propose a video recognition framework that relies on two key observations: 1) while pixels may change rapidly from frame to frame, the semantic content of a scene evolves more slowly, and 2) execution can be viewed as an aspect of architecture, yielding purpose-fit computation schedules for networks. We define a novel family of "clockwork" convnets driven by fixed or adaptive clock signals that schedule the processing of different layers at different update rates according to their semantic stability. We design a pipeline schedule to reduce latency for real-time recognition and a fixed-rate schedule to reduce overall computation.…
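The fixed-rate idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `ClockworkNet`, the toy arithmetic stages, and the periods `[1, 2, 4]` are all hypothetical, and the choice to give later stages longer periods is an assumption about which layers are more semantically stable. Each stage is recomputed only when its clock fires and otherwise reuses its cached output.

```python
from typing import Callable, List, Optional, Sequence

Stage = Callable[[float], float]


class ClockworkNet:
    """Toy fixed-rate schedule: stage i is recomputed every periods[i] frames."""

    def __init__(self, stages: Sequence[Stage], periods: Sequence[int]) -> None:
        if len(stages) != len(periods):
            raise ValueError("need exactly one period per stage")
        if any(p < 1 for p in periods):
            raise ValueError("periods must be positive integers")
        self.stages: List[Stage] = list(stages)
        self.periods: List[int] = list(periods)
        self.cache: List[Optional[float]] = [None] * len(self.stages)
        self.calls: List[int] = [0] * len(self.stages)  # how often each stage ran

    def step(self, t: int, frame: float) -> float:
        """Process frame number t, reusing cached outputs of stages whose clock is idle."""
        x = frame
        for i, (stage, period) in enumerate(zip(self.stages, self.periods)):
            if t % period == 0 or self.cache[i] is None:
                self.cache[i] = stage(x)  # clock fires: recompute this stage
                self.calls[i] += 1
            x = self.cache[i]  # otherwise pass the stale (cached) output forward
        return x


# Hypothetical stand-ins for shallow, middle, and deep layers (assumption).
net = ClockworkNet(
    stages=[lambda x: x + 1, lambda x: x * 2, lambda x: x - 3],
    periods=[1, 2, 4],
)
outputs = [net.step(t, float(t)) for t in range(8)]
print(outputs)    # [-1.0, -1.0, -1.0, -1.0, 7.0, 7.0, 7.0, 7.0]
print(net.calls)  # [8, 4, 2]
```

Over eight frames the three stages run 8, 4, and 2 times instead of 8 each, which is where the computational saving of a fixed-rate schedule would come from in this toy setting. The final output only changes when the slowest stage fires, so the sketch also shows the trade-off: a slower clock means staler predictions.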
