TL;DR
This paper presents an end-to-end trainable framework for pixel-level grouping tasks such as instance segmentation. It combines hyper-spherical pixel embeddings with a differentiable mean-shift clustering module and reports improved results over existing methods.
Contribution
The paper introduces two components for pixel grouping tasks: a hyper-spherical embedding space and a recurrent, differentiable mean-shift clustering module.
Findings
Significant improvements in instance segmentation performance.
Effective grouping for boundary detection and semantic segmentation.
Theoretical analysis of embedding dimension and margin choices.
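To illustrate why embedding dimension and margin are linked, the sketch below uses a standard geometric fact rather than anything specific to the paper: n unit vectors arranged as a regular simplex have pairwise cosine similarity of exactly -1/(n-1). The function name `simplex_directions` is hypothetical, and the vectors are written in n coordinates although they span only n-1 dimensions.

```python
import numpy as np

def simplex_directions(n_groups):
    """Return n_groups unit vectors with equal pairwise cosine similarity.

    Illustrative helper (an assumption, not from the paper): the rows are
    the vertices of a regular simplex centered at the origin.
    """
    # Subtract the centroid (1/n, ..., 1/n) from each standard basis vector.
    centered = np.eye(n_groups) - 1.0 / n_groups
    # Project each row onto the unit sphere.
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)

dirs = simplex_directions(4)
cos = dirs @ dirs.T
# Every off-diagonal entry equals -1/(n-1); with n = 4 that is -1/3.
off_diag = cos[~np.eye(4, dtype=bool)]
```

Under this arrangement, any margin above -1/(n-1) can be satisfied by n groups. Fitting more groups into the same number of dimensions forces some pairs closer together, so the margin has to be loosened.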
Abstract
We introduce a differentiable, end-to-end trainable framework for solving pixel-level grouping problems such as instance segmentation consisting of two novel components. First, we regress pixels into a hyper-spherical embedding space so that pixels from the same group have high cosine similarity while those from different groups have similarity below a specified margin. We analyze the choice of embedding dimension and margin, relating them to theoretical results on the problem of distributing points uniformly on the sphere. Second, to group instances, we utilize a variant of mean-shift clustering, implemented as a recurrent neural network parameterized by kernel bandwidth. This recurrent grouping module is differentiable, enjoys convergent dynamics and probabilistic interpretability. Backpropagating the group-weighted loss through this module allows learning to focus on only correcting…
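The two components described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The loss is a simplified pairwise cosine-margin form, and the clustering step is a plain weighted-mean mean-shift with an exponential (von Mises-Fisher style) kernel. The function names, the default values, and the use of `bandwidth` as a concentration parameter (larger means a narrower kernel) are all hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project each row of x onto the unit hypersphere."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def pairwise_cosine_margin_loss(emb, labels, margin=0.5):
    """Assumed pairwise loss, simplified for illustration.

    Same-group pairs are pulled toward cosine similarity 1; different-group
    pairs are penalized only when their similarity exceeds `margin`.
    """
    z = l2_normalize(emb)
    sim = z @ z.T                                  # (N, N) cosine similarities
    same = labels[:, None] == labels[None, :]
    pos = np.where(same, 1.0 - sim, 0.0)
    neg = np.where(~same, np.maximum(sim - margin, 0.0), 0.0)
    return (pos + neg).mean()

def spherical_mean_shift(emb, bandwidth=10.0, n_iters=10):
    """Assumed mean-shift variant on the sphere, unrolled for n_iters steps.

    Each step replaces every point by the kernel-weighted mean of all
    points and re-projects it onto the sphere. Every operation is
    differentiable, so the same loop could be unrolled in an autodiff
    framework.
    """
    z = l2_normalize(emb)
    for _ in range(n_iters):
        k = np.exp(bandwidth * (z @ z.T))          # kernel weights, (N, N)
        z = l2_normalize(k @ z)                    # weighted mean, re-projected
    return z

# Toy data: two tight groups on opposite sides of the unit circle.
emb = np.array([[1.0, 0.1], [1.0, -0.1], [-1.0, 0.1], [-1.0, -0.1]])
labels = np.array([0, 0, 1, 1])

loss_good = pairwise_cosine_margin_loss(emb, labels)
loss_bad = pairwise_cosine_margin_loss(emb, np.array([0, 1, 0, 1]))
shifted = spherical_mean_shift(emb)
```

On this toy input the loss is lower for the labeling that matches the geometry. After the mean-shift iterations, points in the same group nearly coincide while the two groups stay on opposite sides of the circle.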
