Controllable Attention for Structured Layered Video Decomposition
Jean-Baptiste Alayrac, João Carreira, Relja Arandjelović and Andrew Zisserman

TL;DR
This paper introduces a structured neural network architecture for layered video decomposition that allows explicit control over which layers to attend to, improving separation quality and enabling applications like reflection removal and action recognition.
Contribution
We propose a new neural network architecture that explicitly models layers as spatial masks, enhancing separation performance and controllability in video decomposition.
Findings
Improved layer separation over previous methods
Effective use of external cues like audio for control
Successful application to real-world tasks such as reflection removal
Abstract
The objective of this paper is to be able to separate a video into its natural layers, and to control which of the separated layers to attend to. For example, to be able to separate reflections, transparency or object motion. We make the following three contributions: (i) we introduce a new structured neural network architecture that explicitly incorporates layers (as spatial masks) into its design. This improves separation performance over previous general purpose networks for this task; (ii) we demonstrate that we can augment the architecture to leverage external cues such as audio for controllability and to help disambiguation; and (iii) we experimentally demonstrate the effectiveness of our approach and training procedure with controlled experiments while also showing that the proposed model can be successfully applied to real-world applications such as reflection removal and action…
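The idea of treating layers as spatial masks can be illustrated with a small NumPy sketch. It is not the architecture from the paper: the function name `decompose_with_masks`, the tensor shapes, and the choice of a softmax over layers are all assumptions made for illustration, and the mask logits are random here where a trained network would predict them.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decompose_with_masks(video, mask_logits):
    """Split a video into K layers using per-pixel soft masks.

    video:       (T, H, W, C) array of frames.
    mask_logits: (K, T, H, W) array, one score map per layer
                 (hypothetical stand-in for a network's output).
    Returns (layers, masks) with shapes (K, T, H, W, C) and (K, T, H, W).
    """
    # Softmax over the layer axis makes the masks sum to 1 at every pixel.
    masks = softmax(mask_logits, axis=0)
    # Each layer is the input video weighted by that layer's mask.
    layers = masks[..., None] * video[None]
    return layers, masks

rng = np.random.default_rng(0)
video = rng.random((4, 8, 8, 3))             # 4 frames of 8x8 RGB
mask_logits = rng.normal(size=(2, 4, 8, 8))  # 2 layers
layers, masks = decompose_with_masks(video, mask_logits)
```

Because the masks sum to one at each pixel, the layers in this sketch add back up to the input video, which is one simple way to make a decomposition structurally consistent.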
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image Processing Techniques
