CNN in MRF: Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF

Linchao Bao; Baoyuan Wu; Wei Liu

arXiv:1803.09453·cs.CV·March 28, 2018

TL;DR

This paper introduces a spatio-temporal MRF model for video object segmentation in which a CNN encodes higher-order spatial dependencies among pixels and optical flow supplies the temporal dependencies, together with a CNN-embedded inference algorithm.

Contribution

The paper presents a novel CNN-embedded inference method for a higher-order spatio-temporal MRF, improving video object segmentation without dedicated object detectors.

Findings

01 Outperforms DAVIS 2017 Challenge winners

02 Effective integration of CNN and optical flow

03 No need for model ensembling or dedicated detectors

Abstract

This paper addresses the problem of video object segmentation, where the initial object mask is given in the first frame of an input video. We propose a novel spatio-temporal Markov Random Field (MRF) model defined over pixels to handle this problem. Unlike conventional MRF models, the spatial dependencies among pixels in our model are encoded by a Convolutional Neural Network (CNN). Specifically, for a given object, the probability of a labeling to a set of spatially neighboring pixels can be predicted by a CNN trained for this specific object. As a result, higher-order, richer dependencies among pixels in the set can be implicitly modeled by the CNN. With temporal dependencies established by optical flow, the resulting MRF model combines both spatial and temporal cues for tackling video object segmentation. However, performing inference in the MRF model is very difficult due to the…
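
To make the two kinds of dependency in the abstract concrete, here is a minimal Python sketch of a toy energy over a sequence of binary masks. It is an illustration under stated assumptions, not the paper's formulation: the function names, the label-disagreement temporal cost, the weight `lam`, and the `score_fn` stand-in for the object-specific CNN are all hypothetical. The temporal term links each pixel to the pixel its optical-flow vector points to in the next frame; the spatial term scores a whole mask at once, which is how a higher-order dependency differs from a pairwise one.

```python
import numpy as np


def temporal_energy(labels_t, labels_t1, flow):
    """Count label disagreements between each pixel in frame t and the
    pixel its rounded flow vector points to in frame t+1.

    flow[..., 0] is the horizontal (x) displacement and flow[..., 1] the
    vertical (y) displacement; this layout is an assumption of the sketch.
    """
    h, w = labels_t.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.rint(xs + flow[..., 0]).astype(int)
    yt = np.rint(ys + flow[..., 1]).astype(int)
    # Pixels whose flow target falls outside the frame contribute nothing.
    valid = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)
    disagree = labels_t[valid] != labels_t1[yt[valid], xt[valid]]
    return float(disagree.sum())


def spatial_energy(labels, score_fn):
    """Higher-order spatial term: negative log of a score for the whole mask.

    score_fn is a hypothetical stand-in for an object-specific CNN and must
    return a value in (0, 1].
    """
    return -float(np.log(score_fn(labels)))


def total_energy(masks, flows, score_fn, lam=1.0):
    """Sum the spatial term over frames and the weighted temporal term over
    consecutive frame pairs; flows[i] maps frame i to frame i+1."""
    energy = sum(spatial_energy(m, score_fn) for m in masks)
    energy += lam * sum(
        temporal_energy(masks[i], masks[i + 1], flows[i])
        for i in range(len(flows))
    )
    return energy
```

With this toy energy, a mask that moves one pixel to the right between two frames incurs no temporal cost when the flow field is one pixel to the right everywhere, and a nonzero cost when the flow field is zero. Because `score_fn` sees the entire mask, the spatial term cannot be decomposed into per-pixel or pairwise parts, which hints at why inference in such a model is hard.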

Taxonomy

Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis