Predicting Future Instance Segmentation by Forecasting Convolutional Features
Pauline Luc, Camille Couprie, Yann LeCun, Jakob Verbeek

TL;DR
This paper introduces a method for future instance segmentation that forecasts convolutional features within the Mask R-CNN framework. It outperforms strong baselines, including ones based on optical flow, at predicting individual object segments in future video frames.
Contribution
The authors propose a predictive model that operates in the feature space of Mask R-CNN to forecast future instance segmentation. Because these convolutional features have a fixed size, the model can cope with a varying number of objects per image.
Findings
Significant improvement over optical flow baselines.
Effective handling of varying object counts in future frames.
Enhanced accuracy in predicting individual object segments.
Abstract
Anticipating future events is an important prerequisite towards intelligent behavior. Video forecasting has been studied as a proxy task towards this goal. Recent work has shown that to predict semantic segmentation of future frames, forecasting at the semantic level is more effective than forecasting RGB frames and then segmenting these. In this paper we consider the more challenging problem of future instance segmentation, which additionally segments out individual objects. To deal with a varying number of output labels per image, we develop a predictive model in the space of fixed-sized convolutional features of the Mask R-CNN instance segmentation model. We apply the "detection head" of Mask R-CNN on the predicted features to produce the instance segmentation of future frames. Experiments show that this approach significantly improves over strong baselines based on optical flow and…
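The two-stage idea in the abstract (forecast fixed-size features, then decode instances from them) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `forecast_features` uses plain linear extrapolation where the paper trains a convolutional predictor, and `detection_head` is a connected-component stand-in for the real Mask R-CNN head. The point is only that the predictor's input and output keep a fixed shape while the number of decoded instances is free to change.

```python
# Toy sketch, not the authors' code: all function names are hypothetical.

def forecast_features(past_feats):
    """Stand-in predictor: linearly extrapolate the last two feature maps.
    (Assumption: the paper learns this mapping with a convolutional network.)"""
    prev, last = past_feats[-2], past_feats[-1]
    return [[2 * l - p for l, p in zip(last_row, prev_row)]
            for last_row, prev_row in zip(last, prev)]


def detection_head(feats, threshold=0.5):
    """Stand-in head: one instance per 4-connected group of cells above threshold.
    (Assumption: the real Mask R-CNN head predicts boxes, classes, and masks.)"""
    h, w = len(feats), len(feats[0])
    seen, instances = set(), []
    for i in range(h):
        for j in range(w):
            if feats[i][j] <= threshold or (i, j) in seen:
                continue
            stack, cells = [(i, j)], []
            seen.add((i, j))
            while stack:
                y, x = stack.pop()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and feats[ny][nx] > threshold
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            instances.append(sorted(cells))
    return instances


# Two past 3x4 feature maps: one strong response, one that is still growing.
past = [
    [[0.2, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.9]],
    [[0.4, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.9]],
]
future = forecast_features(past)       # same 3x4 shape as the inputs
instances = detection_head(future)     # instance count is not fixed in advance
```

In this toy input the last observed frame yields a single instance, while the forecast frame yields two, even though the feature map never changes size. That is the property the fixed-size feature space is meant to provide.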
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
Methods: Region Proposal Network · Softmax · Convolution · RoIAlign · Mask R-CNN
