MOVE: Unsupervised Movable Object Segmentation and Detection
Adam Bielski, Paolo Favaro

TL;DR
MOVE is an unsupervised object segmentation method that exploits the fact that a foreground object can be shifted locally and still yield a realistic image, achieving state-of-the-art results without annotations.
Contribution
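It introduces a novel unsupervised segmentation approach that combines local object shifts, self-supervised features (e.g. from DINO or MAE), an inpainting network based on the Masked AutoEncoder, and adversarial training, and it outperforms previous methods on several evaluation datasets.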
Findings
Average CorLoc improvement of 7.2% over the SotA in unsupervised single object discovery
Average relative AP improvement of 53% in unsupervised class-agnostic object detection
State-of-the-art performance on several evaluation datasets for unsupervised salient object detection and segmentation
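For readers unfamiliar with the CorLoc metric cited above, the following is a minimal sketch of how it is commonly computed: the fraction of images whose predicted box overlaps at least one ground-truth box with IoU at or above a threshold. The function names `box_iou` and `corloc`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions made for illustration; the exact evaluation protocol used by the paper is not described on this page.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(predictions, ground_truths, iou_threshold=0.5):
    """Fraction of images where the single predicted box matches any ground-truth box.

    predictions:   one predicted box per image
    ground_truths: one list of ground-truth boxes per image
    iou_threshold: assumed value (0.5); not taken from this page
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not predictions:
        raise ValueError("at least one image is required")
    hits = sum(
        any(box_iou(pred, gt) >= iou_threshold for gt in gts)
        for pred, gts in zip(predictions, ground_truths)
    )
    return hits / len(predictions)


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
    gts = [[(0, 0, 10, 10)], [(20, 20, 30, 30)]]
    print(corloc(preds, gts))
```

Under this definition, a score is the share of images localized correctly, so it only rewards one well-placed box per image and does not measure mask quality.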
Abstract
We introduce MOVE, a novel method to segment objects without any form of supervision. MOVE exploits the fact that foreground objects can be shifted locally relative to their initial position and result in realistic (undistorted) new images. This property allows us to train a segmentation model on a dataset of images without annotation and to achieve state of the art (SotA) performance on several evaluation datasets for unsupervised salient object detection and segmentation. In unsupervised single object discovery, MOVE gives an average CorLoc improvement of 7.2% over the SotA, and in unsupervised class-agnostic object detection it gives a relative AP improvement of 53% on average. Our approach is built on top of self-supervised features (e.g. from DINO or MAE), an inpainting network (based on the Masked AutoEncoder) and adversarial training.
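The shift idea in the abstract can be illustrated with a small NumPy sketch: given an image, a predicted foreground mask, and an inpainted background, the masked object is moved by a few pixels and pasted onto the background. This is only an illustration under stated assumptions. The helper names `shift_2d` and `shift_composite`, the zero-filled borders, and the alpha-blend compositing are hypothetical choices, and the segmentation network, the MAE-based inpainter, and the discriminator that judges whether the composite looks realistic are not shown.

```python
import numpy as np


def shift_2d(arr, dy, dx):
    """Shift an array by (dy, dx) pixels along its first two axes.

    Pixels shifted out of the frame are dropped; exposed borders are zero-filled.
    """
    out = np.zeros_like(arr)
    h, w = arr.shape[:2]
    # Matching source and destination windows that stay inside the frame.
    src_y = slice(max(0, -dy), max(0, min(h, h - dy)))
    src_x = slice(max(0, -dx), max(0, min(w, w - dx)))
    dst_y = slice(max(0, dy), max(0, min(h, h + dy)))
    dst_x = slice(max(0, dx), max(0, min(w, w + dx)))
    out[dst_y, dst_x] = arr[src_y, src_x]
    return out


def shift_composite(image, mask, background, dy, dx):
    """Paste the masked foreground, shifted by (dy, dx), onto an inpainted background.

    image:      H x W x 3 float array
    mask:       H x W float array in [0, 1] (hypothetical output of a segmenter)
    background: H x W x 3 float array (hypothetical output of an inpainter)
    """
    shifted_image = shift_2d(image, dy, dx)
    shifted_mask = shift_2d(mask, dy, dx)[..., None]  # add channel axis for broadcasting
    # Alpha-blend: foreground where the shifted mask is 1, background elsewhere.
    return shifted_mask * shifted_image + (1.0 - shifted_mask) * background


if __name__ == "__main__":
    image = np.full((4, 4, 3), 0.2)
    image[1, 1] = 1.0                      # a one-pixel "object"
    mask = np.zeros((4, 4))
    mask[1, 1] = 1.0                       # mask covering exactly that object
    background = np.full((4, 4, 3), 0.2)   # object removed by an ideal inpainter
    composite = shift_composite(image, mask, background, dy=1, dx=2)
    print(composite[2, 3], composite[1, 1])
```

The intuition is that an accurate mask produces a composite that still looks like a natural image, whereas a mask that cuts through the object or includes background leaves visible seams after the shift; in the method described above, adversarial training supplies that realism signal.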
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Dense Connections · Residual Connection · Vision Transformer · Inpainting
