Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks
Sean Bell, C. Lawrence Zitnick, Kavita Bala, Ross Girshick

TL;DR
The Inside-Outside Net (ION) improves object detection by combining multi-scale features from inside the region of interest with contextual information from outside it, using skip pooling and spatial recurrent neural networks, and achieves state-of-the-art results on PASCAL VOC 2012 and MS COCO.
Contribution
This paper introduces ION, an object detection framework that combines multi-scale features from inside the region of interest with contextual information from outside it, improving detection accuracy.
Findings
Improves PASCAL VOC 2012 mAP from 73.9% to 76.4%.
Raises MS COCO mAP from 19.7% to 33.1%.
Won Best Student Entry in the 2015 MS COCO Detection Challenge.
Abstract
It is well known that contextual and multi-scale representations are important for accurate visual recognition. In this paper we present the Inside-Outside Net (ION), an object detector that exploits information both inside and outside the region of interest. Contextual information outside the region of interest is integrated using spatial recurrent neural networks. Inside, we use skip pooling to extract information at multiple scales and levels of abstraction. Through extensive experiments we evaluate the design space and provide readers with an overview of what tricks of the trade are important. ION improves the state of the art on PASCAL VOC 2012 object detection from 73.9% to 76.4% mAP. On the new and more challenging MS COCO dataset, we improve the state of the art from 19.7% to 33.1% mAP. In the 2015 MS COCO Detection Challenge, our ION model won the Best Student Entry and finished 3rd…
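To make the skip pooling idea in the abstract concrete, here is a minimal NumPy sketch of pooling one region of interest from several feature maps at different strides and concatenating the results into a single fixed-size descriptor. It is an illustration under stated assumptions, not the authors' implementation: the function names `roi_max_pool` and `skip_pool`, the toy feature-map shapes, the strides, and the 2x2 output grid are all hypothetical, and the per-location L2 normalization before concatenation is an assumption of this sketch rather than something the abstract states. Any learned layers that would follow the concatenation are omitted.

```python
import numpy as np


def roi_max_pool(fmap, roi, stride, out_size=2):
    """Max-pool the part of fmap (C, H, W) covered by roi into (C, out_size, out_size).

    roi is (x0, y0, x1, y1) in image pixels; stride converts pixels to feature cells.
    """
    c, h, w = fmap.shape
    x0, y0, x1, y1 = roi
    # Project the ROI onto the feature map, clamp it, and keep at least one cell.
    fx0 = min(max(int(np.floor(x0 / stride)), 0), w - 1)
    fy0 = min(max(int(np.floor(y0 / stride)), 0), h - 1)
    fx1 = min(max(int(np.ceil(x1 / stride)), fx0 + 1), w)
    fy1 = min(max(int(np.ceil(y1 / stride)), fy0 + 1), h)
    # Bin edges of the output grid, in feature-map coordinates.
    xs = np.linspace(fx0, fx1, out_size + 1)
    ys = np.linspace(fy0, fy1, out_size + 1)
    out = np.empty((c, out_size, out_size), dtype=fmap.dtype)
    for i in range(out_size):
        for j in range(out_size):
            ya = int(np.floor(ys[i]))
            yb = max(int(np.ceil(ys[i + 1])), ya + 1)
            xa = int(np.floor(xs[j]))
            xb = max(int(np.ceil(xs[j + 1])), xa + 1)
            out[:, i, j] = fmap[:, ya:yb, xa:xb].max(axis=(1, 2))
    return out


def skip_pool(fmaps, strides, roi, out_size=2, eps=1e-12):
    """Pool the same ROI from every feature map and concatenate along channels."""
    pooled = []
    for fmap, stride in zip(fmaps, strides):
        p = roi_max_pool(fmap, roi, stride, out_size)
        # Assumption of this sketch: L2-normalize across channels at each grid
        # cell so that layers with different activation scales are comparable.
        norm = np.sqrt((p ** 2).sum(axis=0, keepdims=True)) + eps
        pooled.append(p / norm)
    return np.concatenate(pooled, axis=0)


# Toy feature maps standing in for three layers of a 64x64-pixel image.
rng = np.random.default_rng(0)
fmaps = [rng.random((4, 16, 16)), rng.random((8, 8, 8)), rng.random((16, 4, 4))]
strides = [4, 8, 16]
desc = skip_pool(fmaps, strides, roi=(8, 8, 40, 40))
print(desc.shape)  # (28, 2, 2): 4 + 8 + 16 channels on a 2x2 grid
```

The descriptor has the same spatial size regardless of the ROI's size or of which layer the features came from, which is what lets features from fine and coarse layers be stacked for a single region.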
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
