HoughNet: Integrating near and long-range evidence for visual detection

Nermin Samet; Samet Hicsonmez; Emre Akbas

arXiv:2104.06773·cs.CV·August 19, 2022

TL;DR

HoughNet is a voting-based, anchor-free object detection method inspired by the Generalized Hough Transform. It combines near and long-range evidence and achieves competitive results across multiple visual recognition tasks.

Contribution

The paper proposes HoughNet, a one-stage, anchor-free detector that scores each location by the sum of votes cast on it from both near and long-range locations, instead of relying on local evidence alone.

Findings

01

Achieves 46.4 AP (65.1 AP50) on the COCO dataset, on par with the state of the art in bottom-up object detection.

02

Improves performance in video detection, segmentation, 3D detection, and pose estimation.

03

Voting mechanism consistently boosts results across various tasks.

Abstract

This paper presents HoughNet, a one-stage, anchor-free, voting-based, bottom-up object detection method. Inspired by the Generalized Hough Transform, HoughNet determines the presence of an object at a certain location by the sum of the votes cast on that location. Votes are collected from both near and long-distance locations based on a log-polar vote field. Thanks to this voting mechanism, HoughNet is able to integrate both near and long-range, class-conditional evidence for visual recognition, thereby generalizing and enhancing current object detection methodology, which typically relies on only local evidence. On the COCO dataset, HoughNet's best model achieves $46.4$ $AP$ (and $65.1$ $AP_{50}$), performing on par with the state-of-the-art in bottom-up object detection and outperforming most major one-stage and two-stage methods. We further validate the effectiveness of our proposal…
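
The voting step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the ring radii, the number of angular sectors, and the function names `log_polar_field` and `accumulate_votes` are assumptions chosen for illustration, and the centre cell is simply folded into the innermost ring. The sketch builds a log-polar vote field that assigns every spatial offset to a region, then sums the votes that each location casts on the targets inside each region.

```python
import numpy as np


def log_polar_field(radii=(1, 3, 7), n_angles=4):
    """Map every offset in a (K, K) window to a region id, or -1 if outside.

    Ring radii and sector count are illustrative assumptions, not values
    taken from the paper.
    """
    radii = np.asarray(radii)
    r_max = int(radii[-1])
    ys, xs = np.mgrid[-r_max:r_max + 1, -r_max:r_max + 1]
    dist = np.hypot(ys, xs)
    angle = np.mod(np.arctan2(ys, xs), 2 * np.pi)
    # Ring index grows with distance; len(radii) means "beyond the last ring".
    ring = np.searchsorted(radii, dist, side="left")
    sector = np.minimum((angle / (2 * np.pi) * n_angles).astype(int), n_angles - 1)
    field = ring * n_angles + sector
    field[ring >= len(radii)] = -1
    return field


def accumulate_votes(votes, field):
    """Sum votes for one class.

    votes: (R, H, W) array; votes[r, y, x] is the strength of the vote that
           location (y, x) casts on every target inside region r around it.
    Returns an (H, W) map of accumulated votes.
    """
    _, h, w = votes.shape
    r_max = field.shape[0] // 2
    padded = np.zeros((h + 2 * r_max, w + 2 * r_max))
    for dy, dx in zip(*np.nonzero(field >= 0)):
        region = field[dy, dx]
        # Voter (y, x) adds its vote to target (y + dy - r_max, x + dx - r_max).
        padded[dy:dy + h, dx:dx + w] += votes[region]
    # Drop votes that landed outside the image.
    return padded[r_max:r_max + h, r_max:r_max + w]


if __name__ == "__main__":
    field = log_polar_field()
    n_regions = int(field.max()) + 1
    votes = np.random.rand(n_regions, 16, 16)
    heatmap = accumulate_votes(votes, field)
    print(heatmap.shape)
```

Under these assumptions, a peak in the accumulated map marks a location that many voters, near and far, agree on. Because ring width grows with distance, nearby evidence is localized finely while distant evidence is pooled over larger regions.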

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Image and Object Detection Techniques