FOVEA: Foveated Image Magnification for Autonomous Navigation
Chittesh Thavamani, Mengtian Li, Nicolas Cebron, Deva Ramanan

TL;DR
FOVEA is a foveated image magnification technique that downsamples high-resolution images while keeping salient regions magnified, improving small-object detection in autonomous driving with minimal computational overhead.
Contribution
FOVEA introduces a differentiable resampling method that magnifies salient image regions based on simple cues, enhancing detection accuracy in real-time autonomous navigation systems.
Findings
Boosts detection AP over standard methods on autonomous driving datasets.
Improves small-object detection accuracy by over 2x.
Sets a new streaming AP record with minimal computational overhead.
Abstract
Efficient processing of high-res video streams is safety-critical for many robotics applications such as autonomous driving. To maintain real-time performance, many practical systems downsample the video stream. But this can hurt downstream tasks such as (small) object detection. Instead, we take inspiration from biological vision systems that allocate more foveal "pixels" to salient parts of the scene. We introduce FOVEA, an approach for intelligent downsampling that ensures salient image regions remain "magnified" in the downsampled output. Given a high-res image, FOVEA applies a differentiable resampling layer that outputs a small fixed-size image canvas, which is then processed with a differentiable vision module (e.g., object detection network), whose output is then differentiably backward mapped onto the original image size. The key idea is to resample such that background pixels…
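The resampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes saliency is supplied as two non-negative 1D profiles (one per image axis) and uses inverse-CDF sampling along each axis, so that regions with more saliency mass receive more output pixels. The function names (saliency_warp_1d, foveated_downsample, unwarp_box) are hypothetical. Nearest-neighbour lookup is used for brevity; a differentiable layer would instead use bilinear sampling (for example, torch.nn.functional.grid_sample in PyTorch).

```python
import numpy as np

def saliency_warp_1d(saliency, out_size):
    """For each of out_size output pixels, return the source coordinate
    (in input pixel-centre units) chosen by inverse-CDF sampling."""
    s = np.maximum(np.asarray(saliency, dtype=float), 1e-6)  # keep CDF strictly increasing
    s = s / s.sum()
    cdf = np.concatenate([[0.0], np.cumsum(s)])   # CDF evaluated at pixel edges
    edges = np.arange(len(s) + 1, dtype=float)    # edge positions 0..in_size
    u = (np.arange(out_size) + 0.5) / out_size    # output pixel centres in (0, 1)
    src = np.interp(u, cdf, edges) - 0.5          # edge coords -> centre coords
    return np.clip(src, 0, len(s) - 1)

def foveated_downsample(image, sal_x, sal_y, out_h, out_w):
    """Separable warp: sample rows and columns independently.
    Returns the small image plus the per-axis source coordinates."""
    xs = saliency_warp_1d(sal_x, out_w)
    ys = saliency_warp_1d(sal_y, out_h)
    xi = np.rint(xs).astype(int)                  # nearest neighbour (not differentiable)
    yi = np.rint(ys).astype(int)
    return image[np.ix_(yi, xi)], xs, ys

def unwarp_box(box, xs, ys):
    """Map a box (x1, y1, x2, y2) given in output pixel coordinates
    back to input image coordinates using the same warp."""
    x1, y1, x2, y2 = box
    ox, oy = np.arange(len(xs)), np.arange(len(ys))
    return (float(np.interp(x1, ox, xs)), float(np.interp(y1, oy, ys)),
            float(np.interp(x2, ox, xs)), float(np.interp(y2, oy, ys)))

if __name__ == "__main__":
    image = np.arange(100 * 100).reshape(100, 100)
    sal = np.ones(100)
    sal[40:60] = 10.0                             # salient band in the middle
    small, xs, ys = foveated_downsample(image, sal, sal, 50, 50)
    print(small.shape)                            # output canvas size
    print(np.sum((xs >= 39.5) & (xs < 59.5)))     # output columns spent on the salient band
    print(unwarp_box((10, 10, 40, 40), xs, ys))   # box mapped back to input coordinates
```

With uniform saliency the sketch reduces to ordinary uniform downsampling; with a peaked profile, most output pixels are spent on the salient band and the background is compressed. Detections produced on the small canvas are mapped back through the same per-axis coordinates, which mirrors the backward mapping step mentioned in the abstract.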
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging
Methods: Convolution · Region Proposal Network · High-resolution input · RoIPool · Softmax · Faster R-CNN
