Segmenting the motion components of a video: A long-term unsupervised model

Etienne Meunier; Patrick Bouthemy

arXiv:2310.01040·cs.CV·April 18, 2024·1 cites

TL;DR

This paper introduces a novel unsupervised long-term spatio-temporal model using a transformer architecture to segment coherent motion components in videos, emphasizing temporal consistency and sequence-wide segmentation.

Contribution

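It presents a new transformer-based framework for unsupervised, sequence-wide motion segmentation with improved temporal consistency. The loss function is derived from the Evidence Lower Bound (ELBO) and relies on spatio-temporal parametric motion models: quadratic polynomials for the spatial dimensions and B-splines for the time dimension.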

Findings

01. Competitive results on four VOS benchmarks.

02. Performs motion segmentation on entire sequences in one step.

03. Enhances temporal consistency in motion segmentation.

Abstract

Human beings have the ability to continuously analyze a video and immediately extract the motion components. We want to adopt this paradigm to provide a coherent and stable motion segmentation over the video sequence. In this perspective, we propose a novel long-term spatio-temporal model operating in a totally unsupervised way. It takes as input the volume of consecutive optical flow (OF) fields, and delivers a volume of segments of coherent motion over the video. More specifically, we have designed a transformer-based network, where we leverage a mathematically well-founded framework, the Evidence Lower Bound (ELBO), to derive the loss function. The loss function combines a flow reconstruction term involving spatio-temporal parametric motion models combining, in a novel way, polynomial (quadratic) motion models for the spatial dimensions and B-splines for the time dimension of the…

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis

Methods: VOS