Detector With Focus: Normalizing Gradient In Image Pyramid
Yonghyun Kim, Bong-Nam Kang, Daijin Kim

TL;DR
This paper introduces a gradient normalization technique for image pyramids that reduces gradient variance caused by resampling, improving the accuracy of classifiers in multi-scale object detection tasks.
Contribution
It proposes a novel gradient normalization method that mitigates gradient variation in image pyramids, enhancing detection performance across multiple visual recognition problems.
Findings
Improved pedestrian detection accuracy
Enhanced pose estimation results
Better object detection performance
Abstract
An image pyramid can extend many object detection algorithms to solve detection on multiple scales. However, interpolation during the resampling process of an image pyramid causes gradient variation, which is the difference in gradients between the original image and the scaled images. Our key insight is that the increased variance of gradients makes it harder for classifiers to assign categories correctly. We prove the existence of the gradient variation by formulating the ratio of gradient expectations between an original image and scaled images, then propose a simple and novel gradient normalization method to eliminate the effect of this variation. The proposed normalization method reduces the variance in an image pyramid and allows the classifier to focus on a smaller coverage. We show the improvement in three different visual recognition problems: pedestrian detection,…
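The gradient variation described in the abstract can be illustrated with a small experiment. The sketch below is not the authors' formulation: the paper derives the ratio of gradient expectations analytically, whereas this example only measures the ratio empirically on a synthetic noise image. The 2x2 block-average downscaling, the forward-difference gradient, and all function names (`make_image`, `downscale_2x`, `gradient_magnitudes`) are assumptions chosen for illustration.

```python
import random


def make_image(height, width, seed=0):
    """Synthetic grayscale image of uniform noise in [0, 1) (illustrative only)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(width)] for _ in range(height)]


def downscale_2x(img):
    """Halve the resolution by averaging each 2x2 block (one possible resampler)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def gradient_magnitudes(img):
    """Gradient magnitude at each pixel using forward differences."""
    h, w = len(img), len(img[0])
    mags = []
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]
            gy = img[y + 1][x] - img[y][x]
            mags.append((gx * gx + gy * gy) ** 0.5)
    return mags


def mean(values):
    return sum(values) / len(values)


original = make_image(64, 64)
scaled = downscale_2x(original)

g_orig = gradient_magnitudes(original)
g_scaled = gradient_magnitudes(scaled)

# Empirical ratio of mean gradient magnitudes between the scaled and original image.
ratio = mean(g_scaled) / mean(g_orig)

# Assumed normalization: rescale the gradients of the scaled image so that
# their mean matches the mean gradient of the original image.
g_normalized = [g / ratio for g in g_scaled]

print("ratio of mean gradients (scaled / original):", ratio)
print("mean gradient, original:  ", mean(g_orig))
print("mean gradient, normalized:", mean(g_normalized))
```

For this noise image the averaging step smooths the signal, so the measured ratio falls below 1; dividing by it brings the mean gradient of the scaled level back in line with the original. A real detector would apply such a correction per pyramid level before computing gradient-based features, with the ratio obtained as described in the paper rather than measured per image.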
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
