Learning a Layout Transfer Network for Context Aware Object Detection
Tao Wang, Xuming He, Yuanzheng Cai, Guobao Xiao

TL;DR
This paper introduces a context-aware object detection approach that leverages scene layout retrieval and transformation to improve detection accuracy in complex environments like traffic surveillance and autonomous driving.
Contribution
It proposes a novel Layout Transfer Network that integrates scene layout reasoning into Faster R-CNN, enhancing detection performance with a retrieve-and-transform scene layout model.
Findings
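Improves object detection accuracy on three public datasets covering traffic surveillance and autonomous driving.
Produces interpretable, semantically meaningful feature maps of object locations and scales.
Achieves consistent performance gains over state-of-the-art object detection baselines.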
Abstract
We present a context aware object detection method based on a retrieve-and-transform scene layout model. Given an input image, our approach first retrieves a coarse scene layout from a codebook of typical layout templates. In order to handle large layout variations, we use a variant of the spatial transformer network to transform and refine the retrieved layout, resulting in a set of interpretable and semantically meaningful feature maps of object locations and scales. The above steps are implemented as a Layout Transfer Network which we integrate into Faster RCNN to allow for joint reasoning of object detection and scene layout estimation. Extensive experiments on three public datasets verified that our approach provides consistent performance improvements to the state-of-the-art object detection baselines on a variety of challenging tasks in the traffic surveillance and the autonomous…
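The retrieve-and-transform idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes soft retrieval (a softmax-weighted average of codebook templates) and a single 2x3 affine matrix applied with bilinear sampling, in the spirit of a spatial transformer. The names `retrieve_layout`, `affine_transform`, `codebook`, `scores`, and `theta` are hypothetical, and in the actual network the scores and transform parameters would be predicted from image features rather than set by hand. Plain Python is used so the example has no dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve_layout(scores, codebook):
    """Assumed soft retrieval: softmax-weighted average of layout templates."""
    weights = softmax(scores)
    h, w = len(codebook[0]), len(codebook[0][0])
    return [[sum(wt * tpl[y][x] for wt, tpl in zip(weights, codebook))
             for x in range(w)] for y in range(h)]


def bilinear(layout, y, x):
    """Sample the layout at real-valued pixel (y, x); zero outside the map."""
    h, w = len(layout), len(layout[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                value += (1 - abs(y - yy)) * (1 - abs(x - xx)) * layout[yy][xx]
    return value


def affine_transform(layout, theta):
    """Warp a layout (at least 2x2) with a 2x3 affine matrix.

    theta maps each output pixel, in normalized [-1, 1] coordinates,
    to the source location that is sampled bilinearly.
    """
    h, w = len(layout), len(layout[0])
    out = []
    for i in range(h):
        yn = 2 * i / (h - 1) - 1
        row = []
        for j in range(w):
            xn = 2 * j / (w - 1) - 1
            xs = theta[0][0] * xn + theta[0][1] * yn + theta[0][2]
            ys = theta[1][0] * xn + theta[1][1] * yn + theta[1][2]
            # Convert normalized source coordinates back to pixel indices.
            px = (xs + 1) * (w - 1) / 2
            py = (ys + 1) * (h - 1) / 2
            row.append(bilinear(layout, py, px))
        out.append(row)
    return out


# Hypothetical codebook of two 3x3 single-class layout templates.
codebook = [
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],  # object mass at the centre
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],  # object mass at the top-left
]
scores = [100.0, 0.0]                      # strongly prefers the first template
retrieved = retrieve_layout(scores, codebook)

shift = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # samples one pixel to the right
warped = affine_transform(retrieved, shift)
```

Because the transform is defined on the sampling grid, a positive horizontal offset in `theta` moves the layout content toward the left of the output map. The refined map could then be supplied to the detector as an additional feature channel, which is how the abstract describes the layout being used alongside Faster R-CNN.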
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Spatial Transformer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam
