Multi-view X-ray R-CNN
Jan-Martin O. Steitz, Faraz Saeedan, Stefan Roth

TL;DR
This paper presents a multi-view X-ray object detection method for security screening that uses a CNN with a geometry-aware pooling layer. It is more accurate than single-view detection and more efficient than running single-view detection in each view.
Contribution
The paper introduces a multi-view pooling layer that exploits the known geometry of the imaging system, and an end-to-end trainable multi-view detection pipeline based on Faster R-CNN.
Findings
Significant accuracy gains over single-view detection.
More efficient than performing single-view detection in each view.
Geometrically consistent 3D aggregation of the 2D CNN features extracted from each view.
Abstract
Motivated by the detection of prohibited objects in carry-on luggage as a part of avionic security screening, we develop a CNN-based object detection approach for multi-view X-ray image data. Our contributions are two-fold. First, we introduce a novel multi-view pooling layer to perform a 3D aggregation of 2D CNN-features extracted from each view. To that end, our pooling layer exploits the known geometry of the imaging system to ensure geometric consistency of the feature aggregation. Second, we introduce an end-to-end trainable multi-view detection pipeline based on Faster R-CNN, which derives the region proposals and performs the final classification in 3D using these aggregated multi-view features. Our approach shows significant accuracy gains compared to single-view detection while even being more efficient than performing single-view detection in each view.
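The idea of geometry-aware multi-view pooling can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that the imaging geometry is known and that 2D features are aggregated in 3D. The 3x4 projection matrices, nearest-neighbour sampling, max pooling, single-channel feature maps, and all names below (`project`, `sample`, `multi_view_pool`, `P_A`, `P_B`) are assumptions made for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]      # assumed 3x4 projection matrix per view
FeatureMap = Sequence[Sequence[float]]  # H x W map, one channel for simplicity
Point = Tuple[float, float, float]


def project(P: Matrix, point: Point) -> Optional[Tuple[float, float]]:
    """Project a 3D point to image coordinates (u, v) in homogeneous form."""
    hom = (point[0], point[1], point[2], 1.0)
    u, v, w = (sum(P[r][c] * hom[c] for c in range(4)) for r in range(3))
    if w <= 0:  # point is not in front of the (hypothetical) sensor
        return None
    return u / w, v / w


def sample(feat: FeatureMap, u: float, v: float) -> Optional[float]:
    """Nearest-neighbour lookup; returns None outside the feature map."""
    row, col = math.floor(v + 0.5), math.floor(u + 0.5)
    if 0 <= row < len(feat) and 0 <= col < len(feat[0]):
        return feat[row][col]
    return None


def multi_view_pool(points: Sequence[Point],
                    views: Sequence[Tuple[Matrix, FeatureMap]]) -> List[float]:
    """For each 3D point, max-pool the features it projects to in every view."""
    pooled = []
    for p in points:
        values = []
        for P, feat in views:
            uv = project(P, p)
            if uv is None:
                continue
            val = sample(feat, *uv)
            if val is not None:
                values.append(val)
        # Points seen by no view get a neutral value of 0.0 (an assumption).
        pooled.append(max(values) if values else 0.0)
    return pooled


# Toy setup: view A looks along z (u = x, v = y), view B along x (u = z, v = y).
P_A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
P_B = [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
feat_a = [[0.1, 0.2], [0.3, 0.4]]
feat_b = [[0.9, 0.0], [0.0, 0.5]]
views = [(P_A, feat_a), (P_B, feat_b)]

grid_points = [(1, 0, 0), (0, 1, 1), (1, 1, 5), (5, 5, 5)]
pooled_features = multi_view_pool(grid_points, views)
```

Because every view is sampled through its own projection, the pooled value at a 3D location only combines features that correspond to the same point in space, which is the geometric consistency the abstract refers to. In a detection pipeline the pooled 3D features would then feed the region proposal and classification stages.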
Taxonomy
Methods: Region Proposal Network · Convolution · RoIPool · Softmax · Faster R-CNN
