A convnet for non-maximum suppression
Jan Hosang, Rodrigo Benenson, Bernt Schiele

TL;DR
This paper introduces a convolutional neural network designed to improve non-maximum suppression in object detection, addressing limitations of traditional greedy methods and enhancing detection accuracy.
Contribution
The paper presents a novel convnet architecture for NMS that outperforms traditional greedy algorithms in recall and precision.
Findings
Convnet-based NMS achieves better recall and precision than greedy NMS.
Experiments on a synthetic setup and on crowded pedestrian detection scenes demonstrate the improvement.
The approach overcomes the trade-off inherent in fixed-threshold greedy NMS.
Abstract
Non-maximum suppression (NMS) is used in virtually all state-of-the-art object detection pipelines. While essential object detection ingredients such as features, classifiers, and proposal methods have been extensively researched, surprisingly little work has aimed to systematically address NMS. The de facto standard for NMS is based on greedy clustering with a fixed distance threshold, which forces a trade-off between recall and precision. We propose a convnet designed to perform NMS of a given set of detections. We report experiments on a synthetic setup, and results on crowded pedestrian detection scenes. Our approach overcomes the intrinsic limitations of greedy NMS, obtaining better recall and precision.
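To make the recall/precision trade-off concrete, here is a minimal sketch of the greedy baseline the abstract describes. It is not the convnet proposed in the paper. It assumes intersection-over-union (IoU) as the overlap measure and axis-aligned boxes given as (x1, y1, x2, y2). The names `iou`, `greedy_nms`, `iou_threshold`, and the toy boxes are illustrative and not taken from the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, visiting them by descending score.

    A box is kept only if its IoU with every already-kept box
    does not exceed the fixed threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Toy crowded scene (hypothetical data): boxes 0 and 1 are two distinct,
# overlapping pedestrians; box 2 is a duplicate detection of box 0.
boxes = [(0, 0, 10, 20), (4, 0, 14, 20), (1, 0, 11, 20)]
scores = [0.9, 0.8, 0.7]

strict = greedy_nms(boxes, scores, iou_threshold=0.3)   # [0]: drops pedestrian 1
medium = greedy_nms(boxes, scores, iou_threshold=0.5)   # [0, 1]: correct here
loose = greedy_nms(boxes, scores, iou_threshold=0.9)    # [0, 1, 2]: keeps duplicate
```

In this toy scene a strict threshold suppresses a true neighbouring pedestrian (lower recall), while a loose one keeps the duplicate (lower precision). A threshold of 0.5 happens to work here, but no single fixed value suits every level of crowding, which is the limitation the abstract refers to.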
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
