TL;DR
This paper introduces an encoder-decoder neural network with multi-scale feature fusion for foreground segmentation, achieving state-of-the-art results on CDnet2014 while training on only a few examples.
Contribution
It extends FgSegNet's Feature Pooling Module (FPM) with internal feature fusion, enabling robust multi-scale feature extraction without multi-scale inputs.
Findings
Outperforms existing state-of-the-art methods on CDnet2014 with an average overall F-Measure of 0.9847
Effective on SBI2015 and UCSD datasets
Requires only a few training examples
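For readers unfamiliar with the metric, the F-Measure quoted above is the harmonic mean of precision and recall over foreground pixels. The snippet below is a minimal sketch of that formula for a single pair of binary masks; the function name `f_measure` and the toy masks are illustrative assumptions, and it does not reproduce how the benchmark averages scores across videos and categories.

```python
def f_measure(pred, truth):
    """F-Measure of a predicted binary mask against a ground-truth mask.

    Both masks are lists of rows holding 0 (background) or 1 (foreground).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # foreground correctly detected
            elif p and not t:
                fp += 1      # background wrongly marked as foreground
            elif t and not p:
                fn += 1      # foreground pixel that was missed
    if tp == 0:
        return 0.0           # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy masks (hypothetical data): 2 true positives, 1 false positive, 1 false negative.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
score = f_measure(predicted, ground_truth)
```

With these toy masks, precision and recall are both 2/3, so the score is also 2/3.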
Abstract
Foreground segmentation algorithms aim to segment moving objects from the background robustly under various challenging scenarios. Encoder-decoder deep neural networks recently used in this domain achieve impressive segmentation results. In this work, we propose a novel, robust encoder-decoder neural network that can be trained end-to-end using only a few training examples. The proposed method extends the Feature Pooling Module (FPM) of FgSegNet by introducing feature fusions inside this module, which enables it to extract multi-scale features within images. This results in feature pooling that is robust against camera motion and can alleviate the need for multi-scale inputs to the network. Our method outperforms all existing state-of-the-art methods on the CDnet2014 dataset, with an average overall F-Measure of 0.9847. We also evaluate the effectiveness of our…
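To make the idea of fusing features inside a pooling module more concrete, here is a rough pure-Python sketch, not the authors' implementation. It assumes that the branches are 3x3 dilated convolutions and that each branch receives the input concatenated with the previous branch's output; the abstract states neither detail. The dilation rates, the fixed averaging kernels (standing in for learned weights), and the function names are all hypothetical.

```python
def dilated_conv3x3(channels, kernels, rate):
    """Zero-padded 3x3 dilated convolution (cross-correlation form).

    channels: list of 2D maps (lists of rows); kernels: one 3x3 kernel per
    channel. Returns one 2D map with the same height and width as the input.
    """
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * w for _ in range(h)]
    for ch, k in zip(channels, kernels):
        for y in range(h):
            for x in range(w):
                acc = 0.0
                for i in range(3):
                    for j in range(3):
                        yy = y + (i - 1) * rate
                        xx = x + (j - 1) * rate
                        if 0 <= yy < h and 0 <= xx < w:  # outside counts as zero
                            acc += k[i][j] * ch[yy][xx]
                out[y][x] += acc
    return out


def fused_multiscale_features(channels, rates=(1, 2, 4)):
    """Assumed fusion scheme: each branch sees the input plus the previous branch."""
    mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # placeholder for learned weights
    branches = []
    prev = None
    for rate in rates:
        inputs = channels if prev is None else channels + [prev]  # feature fusion
        fmap = dilated_conv3x3(inputs, [mean_kernel] * len(inputs), rate)
        prev = [[max(v, 0.0) for v in row] for row in fmap]       # ReLU
        branches.append(prev)
    return branches  # one feature map per dilation rate, from a single-scale input


# Hypothetical 6x6 single-channel input.
image = [[float((r * 6 + c) % 5) for c in range(6)] for r in range(6)]
features = fused_multiscale_features([image])
```

Larger dilation rates widen the receptive field without resizing the input, which is one plausible way a single-scale image could yield multi-scale features.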
