Differentiable Soft-Masked Attention

Ali Athar; Jonathon Luiten; Alexander Hermans; Deva Ramanan; Bastian Leibe

arXiv:2206.00182 · cs.CV · August 8, 2022

TL;DR

This paper introduces a differentiable soft-masked attention mechanism for transformers, enabling learning of soft masks within the network, and applies it to weakly-supervised video object segmentation with promising results.

Contribution

We propose a novel differentiable soft-masked attention method that allows mask learning without direct supervision, enhancing weakly-supervised video object segmentation.

Findings

01

Effective segmentation in unlabeled frames due to novel attention formulation

02

Achieved weakly-supervised VOS with only one annotated image frame

03

Code available for implementation and further research

Abstract

Transformers have become prevalent in computer vision due to their performance and flexibility in modelling complex operations. Of particular significance is the "cross-attention" operation, which allows a vector representation (e.g. of an object in an image) to be learned by attending to an arbitrarily sized set of input features. Recently, "Masked Attention" was proposed in which a given object representation only attends to those image pixel features for which the segmentation mask of that object is active. This specialization of attention proved beneficial for various image and video segmentation tasks. In this paper, we propose another specialization of attention which enables attending over "soft-masks" (those with continuous mask probabilities instead of binary values), and is also differentiable through these mask probabilities, thus allowing the mask used for attention to be…
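
To make the idea of attending over soft masks concrete, here is a minimal NumPy sketch for a single query. It assumes one plausible formulation, in which the log of the mask probability is added to the attention logits before the softmax; the abstract does not give the paper's exact formula, so this is an illustration and not the authors' implementation. The function name `soft_masked_attention` and the `eps` parameter are hypothetical.

```python
import numpy as np

def soft_masked_attention(query, keys, values, mask_probs, eps=1e-6):
    """Cross-attention for one query, biased by a soft mask.

    query:      (d,)     object representation
    keys:       (n, d)   per-pixel key features
    values:     (n, d_v) per-pixel value features
    mask_probs: (n,)     mask probabilities in [0, 1]

    Assumption (not from the source): the mask enters as an additive
    log-probability bias on the attention logits.
    """
    d = query.shape[-1]
    logits = keys @ query / np.sqrt(d)           # scaled dot-product scores, shape (n,)
    logits = logits + np.log(mask_probs + eps)   # low mask probability -> strongly negative bias
    logits = logits - logits.max()               # subtract max for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()            # softmax over the n pixels
    return weights @ values, weights

# Usage: five pixels, with the mask favouring the first two.
query = np.ones(4)
keys = np.eye(5, 4)
values = np.arange(10, dtype=float).reshape(5, 2)
mask = np.array([0.9, 0.8, 0.1, 0.05, 0.0])
output, weights = soft_masked_attention(query, keys, values, mask)
```

Under this assumption, a binary mask recovers hard masked attention up to `eps`, and an all-ones mask reduces to ordinary cross-attention. Every operation is smooth in `mask_probs`, so in an autodiff framework such as PyTorch gradients could flow back into the mask.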

Code & Models

Repositories

Ali2500/HODOR
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection

Methods: VOS