Exploring Person Context and Local Scene Context for Object Detection
Saurabh Gupta, Bharath Hariharan, Jitendra Malik

TL;DR
This paper investigates how person-specific and scene context can enhance object detection accuracy, especially for small and challenging objects, by modeling spatial relationships and appearance cues.
Contribution
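It introduces two context-aware models: one built around people and the objects they commonly interact with, and one for general object detection that uses spatial relationships between objects and between objects and scenes. Both improve detection performance on the COCO dataset.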
Findings
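Relative improvements of up to 5% over CNN-based state-of-the-art detectors on COCO.
Gains concentrated on hard cases such as small objects (10% relative improvement).
The models capture precise spatial relationships between context and object, and make effective use of the appearance of the contextual region.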
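Because the findings are reported as relative rather than absolute improvements, a short sketch of the arithmetic may help. The AP values below are hypothetical, chosen only to illustrate the calculation, and the helper name `relative_improvement` is an assumption rather than anything from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical average-precision values, NOT taken from the paper.
baseline_ap = 0.20
improved_ap = 0.21

gain = relative_improvement(baseline_ap, improved_ap)
absolute_gain = improved_ap - baseline_ap

# A 5% relative gain here corresponds to only one absolute AP point.
print(f"relative: {gain:.1%}, absolute: {absolute_gain:.3f}")
```

A relative figure depends on the baseline: the same absolute gain looks larger when the baseline is low, which is typically the case for hard categories such as small objects.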
Abstract
In this paper we explore two ways of using context for object detection. The first model focusses on people and the objects they commonly interact with, such as fashion and sports accessories. The second model considers more general object detection and uses the spatial relationships between objects and between objects and scenes. Our models are able to capture precise spatial relationships between the context and the object of interest, and make effective use of the appearance of the contextual region. On the newly released COCO dataset, our models provide relative improvements of up to 5% over CNN-based state-of-the-art detectors, with the gains concentrated on hard cases such as small objects (10% relative improvement).
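The abstract does not say how spatial relationships between a contextual region and a candidate object are represented. As a minimal sketch, assuming a widely used box parameterization (center offsets normalized by the context box size, plus log size ratios), the position of a candidate relative to a person box could be encoded as below. The names `Box` and `relative_geometry` and the example coordinates are hypothetical, and this is not necessarily the encoding the authors use.

```python
import math
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def relative_geometry(context: Box, candidate: Box) -> Tuple[float, float, float, float]:
    """Encode `candidate` relative to `context` as (dx, dy, dw, dh).

    dx, dy: center offset in units of the context box width and height.
    dw, dh: log ratio of candidate size to context size.
    """
    cw, ch = context.x2 - context.x1, context.y2 - context.y1
    ow, oh = candidate.x2 - candidate.x1, candidate.y2 - candidate.y1
    if min(cw, ch, ow, oh) <= 0:
        raise ValueError("boxes must have positive width and height")

    ccx, ccy = context.x1 + cw / 2, context.y1 + ch / 2
    ocx, ocy = candidate.x1 + ow / 2, candidate.y1 + oh / 2

    dx = (ocx - ccx) / cw
    dy = (ocy - ccy) / ch
    dw = math.log(ow / cw)
    dh = math.log(oh / ch)
    return dx, dy, dw, dh


# Hypothetical example: a person box and a small candidate box near the hand.
person = Box(100, 50, 200, 350)
candidate = Box(180, 150, 230, 200)
features = relative_geometry(person, candidate)
print(features)
```

Normalizing by the context box makes the features invariant to where the person appears in the image and to their scale, so a model could learn, for instance, that a certain accessory tends to sit at a consistent offset from the person.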
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection
