TL;DR
This paper introduces an unsupervised image segmentation method based on mutual information maximization between different views created by autoregressive masked convolutions, outperforming current state-of-the-art techniques.
Contribution
It presents a novel approach using autoregressive masked convolutions to generate diverse views for mutual information maximization in unsupervised segmentation.
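To make the idea of "views from different orderings" concrete, here is a minimal NumPy sketch. It builds a raster-scan causal mask for a k x k kernel and derives three more orderings by flipping it. The function names (`raster_mask`, `ordering_masks`, `masked_conv2d`) are hypothetical, and using flips to obtain the other orderings is an assumption for illustration; this summary does not specify which masks the paper actually uses.

```python
import numpy as np

def raster_mask(k, include_center=False):
    """Binary k x k mask keeping only kernel positions that precede the
    centre pixel in a top-left to bottom-right raster scan."""
    mask = np.zeros((k, k), dtype=np.float32)
    c = k // 2
    mask[:c, :] = 1.0   # every row above the centre
    mask[c, :c] = 1.0   # same row, strictly left of the centre
    if include_center:
        mask[c, c] = 1.0
    return mask

def ordering_masks(k):
    """Four orderings obtained by flipping the raster mask (an assumption,
    not necessarily the set of orderings used in the paper)."""
    base = raster_mask(k)
    return {
        "top_left_to_bottom_right": base,
        "top_right_to_bottom_left": np.fliplr(base),
        "bottom_left_to_top_right": np.flipud(base),
        "bottom_right_to_top_left": np.flipud(np.fliplr(base)),
    }

def masked_conv2d(image, kernel, mask):
    """Naive zero-padded 'same' cross-correlation of a 2D image with
    kernel * mask. Written for clarity, not speed."""
    k = kernel.shape[0]
    p = k // 2
    padded = np.pad(image, p)
    weights = kernel * mask
    out = np.zeros_like(image, dtype=np.float32)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * weights)
    return out
```

Applying the same kernel with two different masks gives two outputs that each see a different part of every pixel's neighbourhood, which is the sense in which each mask defines a separate view of the same input.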
Findings
Outperforms the current state of the art in unsupervised image segmentation
Simple, easy-to-implement approach
Can be extended to other visual tasks
Abstract
In this work, we propose a new unsupervised image segmentation approach based on mutual information maximization between different constructed views of the inputs. Taking inspiration from autoregressive generative models that predict the current pixel from past pixels in a raster-scan ordering created with masked convolutions, we propose to use different orderings over the inputs using various forms of masked convolutions to construct different views of the data. For a given input, the model produces a pair of predictions with two valid orderings, and is then trained to maximize the mutual information between the two outputs. These outputs can either be low-dimensional features for representation learning or output clusters corresponding to semantic labels for clustering. While masked convolutions are used during training, in inference, no masking is applied and we fall back to the…
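The training objective described in the abstract, maximizing mutual information between the two outputs, can be sketched for the clustering case as follows. The abstract does not state the exact estimator, so this assumes the standard mutual information of an empirical joint distribution over two sets of soft cluster assignments, as commonly used in clustering by information maximization. The function name `mutual_information` and the `eps` smoothing are illustrative choices, not taken from the paper.

```python
import numpy as np

def mutual_information(p1, p2, eps=1e-8):
    """Mutual information between two soft cluster assignments.

    p1, p2: arrays of shape (N, C); row n holds the cluster probabilities
    predicted for pixel n under the first and second ordering.
    """
    # Empirical joint distribution over cluster pairs, averaged over pixels.
    joint = p1.T @ p2 / p1.shape[0]
    # Symmetrize, since the order of the two views is arbitrary (assumption).
    joint = (joint + joint.T) / 2.0
    # Avoid log(0), then renormalize so the entries sum to 1.
    joint = np.clip(joint, eps, None)
    joint = joint / joint.sum()
    marg1 = joint.sum(axis=1, keepdims=True)  # shape (C, 1)
    marg2 = joint.sum(axis=0, keepdims=True)  # shape (1, C)
    return float(np.sum(joint * (np.log(joint) - np.log(marg1) - np.log(marg2))))
```

In a training loop one would minimize the negative of this quantity. It is largest when the two views agree on confident, evenly used clusters, and it drops to zero when the two predictions are statistically independent.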
Taxonomy
Methods: Convolution
