Learning non-maximum suppression

Jan Hosang; Rodrigo Benenson; Bernt Schiele

arXiv:1705.02950·cs.CV·May 10, 2017·36 cites

TL;DR

This paper introduces a neural network architecture to perform non-maximum suppression (NMS) as part of end-to-end object detection, aiming to improve localization and occlusion handling over traditional hand-crafted methods.

Contribution

A neural network architecture that performs NMS using only the detection boxes and their scores, replacing the standard hand-crafted greedy algorithm.

Findings

01 Improved localization accuracy on PETS and COCO datasets

02 Better occlusion handling compared to traditional NMS

03 Potential for more integrated object detection pipelines

Abstract

Object detectors have hugely profited from moving towards an end-to-end learning paradigm: proposals, features, and the classifier becoming one neural network improved results two-fold on general object detection. One indispensable component is non-maximum suppression (NMS), a post-processing algorithm responsible for merging all detections that belong to the same object. The de facto standard NMS algorithm is still fully hand-crafted, suspiciously simple, and -- being based on greedy clustering with a fixed distance threshold -- forces a trade-off between recall and precision. We propose a new network architecture designed to perform NMS, using only boxes and their score. We report experiments for person detection on PETS and for general object categories on the COCO dataset. Our approach shows promise providing improved localization and occlusion handling.
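
For context, the hand-crafted baseline the abstract refers to can be sketched as follows. This is a minimal illustration of classic greedy NMS, not the learned network the paper proposes; the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, and the default threshold of 0.5 are assumptions made for the example.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A box is kept only if it overlaps every already-kept box by at most
    the fixed threshold; otherwise it is treated as a duplicate.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (hypothetical data).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept_strict = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.7)
```

With the lower threshold the second box is suppressed even if it belongs to a different, occluded object (lost recall); with the higher threshold it survives even if it is a duplicate of the first (lost precision). This is the trade-off imposed by a single fixed threshold that the abstract describes.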

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques