Mask R-CNN
Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

TL;DR
Mask R-CNN is a simple, flexible framework that extends Faster R-CNN to perform high-quality object instance segmentation, achieving top results on COCO benchmarks with minimal additional complexity.
Contribution
The paper introduces Mask R-CNN, an extension of Faster R-CNN that adds a mask prediction branch in parallel with the existing bounding-box recognition branch, so that each detected object also receives an instance segmentation mask.
Findings
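Outperforms all existing single-model entries on the three COCO tracks: instance segmentation, bounding-box object detection, and person keypoint detection
Runs at 5 fps, adding only a small overhead to Faster R-CNN
Generalizes easily to other tasks, such as human pose estimation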
Abstract
We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task,…
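As a rough illustration of what training a mask branch "in parallel" with the box branch can look like, the sketch below computes a per-RoI multi-task loss. It assumes, following the full paper rather than the abstract above, that the mask branch outputs one m x m mask per class and that the mask loss is the average per-pixel binary cross-entropy on the ground-truth class's mask only. The function names (`mask_loss`, `total_loss`) and the toy inputs are hypothetical, and the classification and box losses are simply passed in as numbers.

```python
import math


def mask_loss(mask_logits, gt_class, gt_mask):
    """Average per-pixel binary cross-entropy on one class's mask.

    mask_logits: K masks (one per class), each an m x m nested list of floats.
    gt_class:    index of the RoI's ground-truth class.
    gt_mask:     m x m nested list of 0/1 target values.
    """
    # Only the ground-truth class's mask is scored, so masks for the
    # other classes do not affect the loss for this RoI.
    logits = mask_logits[gt_class]
    total, count = 0.0, 0
    for row_logits, row_target in zip(logits, gt_mask):
        for z, y in zip(row_logits, row_target):
            # Numerically stable sigmoid cross-entropy computed from the logit.
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


def total_loss(cls_loss, box_loss, mask_logits, gt_class, gt_mask):
    """Per-RoI multi-task loss: classification + box regression + mask."""
    return cls_loss + box_loss + mask_loss(mask_logits, gt_class, gt_mask)


if __name__ == "__main__":
    # Toy RoI with K = 2 classes and 2 x 2 masks (assumed values).
    logits = [[[2.0, -1.0], [-3.0, 0.5]], [[0.0, 0.0], [0.0, 0.0]]]
    target = [[1, 0], [0, 1]]
    print(total_loss(0.4, 0.1, logits, gt_class=0, gt_mask=target))
```

Because only the ground-truth class's mask enters the loss, the mask branch does not have to decide which class an RoI belongs to; in this sketch that decision stays with the classification term.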
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Face Recognition and Analysis
Methods: Region Proposal Network · Average Pooling · ResNeXt Block · Grouped Convolution · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling
