Probabilistic Attention for Interactive Segmentation
Prasad Gabbur, Manjot Bilkhu, Javier Movellan

TL;DR
This paper introduces a probabilistic interpretation of attention in transformers. The interpretation enables online adaptation of key and value parameters through Expectation Maximization (EM) algorithms, which improves interactive segmentation by propagating user feedback to other tokens.
Contribution
It presents a novel probabilistic framework for attention, allowing online adaptation of model parameters for interactive tasks, enhancing annotation efficiency.
Findings
Online adaptation of the keys improves performance by ~10% mIoU in the low-feedback regime.
Value propagation improves model responsiveness in the high-feedback regime.
Probabilistic attention outperforms standard methods on benchmark datasets.
Abstract
We provide a probabilistic interpretation of attention and show that the standard dot-product attention in transformers is a special case of Maximum A Posteriori (MAP) inference. The proposed approach suggests the use of Expectation Maximization algorithms for online adaptation of key and value model parameters. This approach is useful for cases in which external agents, e.g., annotators, provide inference-time information about the correct values of some tokens, e.g., the semantic category of some pixels, and we need this new information to propagate to other tokens in a principled manner. We illustrate the approach on an interactive semantic segmentation task in which annotators and models collaborate online to improve annotation efficiency. Using standard benchmarks, we observe that key adaptation boosts model performance (~10% mIoU) in the low feedback regime and value…
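The claim that dot-product attention is a special case of probabilistic inference can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a Gaussian mixture with one component per key, uniform priors, a shared isotropic variance, and unit-norm keys. Under those assumptions the posterior over keys given a query reduces to the softmax of scaled dot products, so the posterior-weighted mean of the values matches standard attention. The function names and the choice of variance are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(Q, K, V, scale):
    # Standard scaled dot-product attention.
    return softmax(Q @ K.T / scale) @ V

def gmm_posterior_attention(Q, K, V, variance):
    # Assumed model: query i is drawn from an isotropic Gaussian centred on
    # one of the keys, with uniform mixing weights (an assumption of this sketch).
    sq_dist = ((Q[:, None, :] - K[None, :, :]) ** 2).sum(axis=-1)
    log_lik = -0.5 * sq_dist / variance
    posterior = softmax(log_lik, axis=-1)  # p(key j | query i)
    # Posterior-weighted mean of the values.
    return posterior @ V

rng = np.random.default_rng(0)
d = 8
Q = rng.normal(size=(4, d))
K = rng.normal(size=(6, d))
K /= np.linalg.norm(K, axis=1, keepdims=True)  # unit-norm keys (assumption)
V = rng.normal(size=(6, 3))

scale = np.sqrt(d)
out_attn = dot_product_attention(Q, K, V, scale)
out_gmm = gmm_posterior_attention(Q, K, V, variance=scale)
max_abs_diff = float(np.abs(out_attn - out_gmm).max())
print(max_abs_diff)
```

The two outputs agree because, with unit-norm keys, the squared distance differs from the negative dot product only by terms that are constant across keys for a given query, and the softmax normalisation cancels them. When the keys do not share a norm, the two forms differ by a per-key bias term.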
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Video Analysis and Summarization
