HoughNet: Integrating near and long-range evidence for bottom-up object detection

Nermin Samet; Samet Hicsonmez; Emre Akbas

arXiv:2007.02355·cs.CV·July 27, 2020

TL;DR

HoughNet is a voting-based, bottom-up object detection method that integrates near and long-range evidence. It achieves competitive results on COCO and improves image generation accuracy when integrated with GAN models.

Contribution

HoughNet presents a novel voting mechanism inspired by the Generalized Hough Transform for improved bottom-up object detection.

Findings

01. Achieves 46.4 AP (65.1 AP50) on COCO, on par with the state of the art in bottom-up object detection.

02. Integrates near and long-range, class-conditional evidence through voting, rather than relying on local evidence only.

03. Improves image generation accuracy when integrated with GAN models.

Abstract

This paper presents HoughNet, a one-stage, anchor-free, voting-based, bottom-up object detection method. Inspired by the Generalized Hough Transform, HoughNet determines the presence of an object at a certain location by the sum of the votes cast on that location. Votes are collected from both near and long-distance locations based on a log-polar vote field. Thanks to this voting mechanism, HoughNet is able to integrate both near and long-range, class-conditional evidence for visual recognition, thereby generalizing and enhancing current object detection methodology, which typically relies on only local evidence. On the COCO dataset, HoughNet's best model achieves 46.4 $AP$ (and 65.1 $AP_{50}$), performing on par with the state-of-the-art in bottom-up object detection and outperforming most major one-stage and two-stage methods. We further validate the effectiveness of our proposal in…
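The voting idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `log_polar_field` and `accumulate_votes`, the ring radii, the number of angular sectors, and the layout of the evidence tensor (one channel per vote-field region, for a single class) are all assumptions made for illustration. The sketch builds a log-polar region map and then, for every location, sums the evidence collected from each offset in the field.

```python
import numpy as np

def log_polar_field(radii, n_angles):
    """Hypothetical helper: label each offset in a (2R+1, 2R+1) window with a
    region id. Region 0 is the center cell; -1 marks offsets beyond the last ring."""
    R = radii[-1]
    ys, xs = np.mgrid[-R:R + 1, -R:R + 1]
    dist = np.hypot(ys, xs)
    angle = np.arctan2(ys, xs) % (2 * np.pi)
    sector = np.floor(angle / (2 * np.pi / n_angles)).astype(int)
    sector = np.minimum(sector, n_angles - 1)  # guard against rounding at 2*pi
    field = np.full(dist.shape, -1, dtype=int)
    field[dist == 0] = 0
    inner = 0.0
    for ring, outer in enumerate(radii):
        mask = (dist > inner) & (dist <= outer)
        field[mask] = 1 + ring * n_angles + sector[mask]
        inner = outer
    return field

def accumulate_votes(evidence, field):
    """Hypothetical helper: evidence has shape (n_regions, H, W) for one class.
    Location (y, x) collects evidence[r, y + dy, x + dx] from every offset
    (dy, dx) whose region id in the vote field is r."""
    _, H, W = evidence.shape
    R = field.shape[0] // 2
    votes = np.zeros((H, W))
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            r = field[dy + R, dx + R]
            if r < 0:
                continue  # offset lies outside the vote field
            ty0, ty1 = max(0, -dy), min(H, H - dy)
            tx0, tx1 = max(0, -dx), min(W, W - dx)
            if ty0 >= ty1 or tx0 >= tx1:
                continue  # no target location has this offset inside the image
            votes[ty0:ty1, tx0:tx1] += evidence[r, ty0 + dy:ty1 + dy, tx0 + dx:tx1 + dx]
    return votes

# Usage: two rings (radii 1 and 3) with 4 sectors each give 1 + 2 * 4 = 9 regions.
field = log_polar_field([1, 3], n_angles=4)
evidence = np.zeros((9, 8, 8))
evidence[0, 4, 4] = 1.0            # local evidence at the location itself
evidence[field[3, 5], 4, 6] = 1.0  # evidence two pixels away, in its matching region
votes = accumulate_votes(evidence, field)
```

In this toy setup, location (4, 4) receives one vote from itself and one from the pixel two columns to its right, so nearby and more distant evidence both contribute to the same sum. A detector would then look for peaks in such a vote map; how the paper extracts detections from it is not described in the text above.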

Taxonomy

Methods: 1x1 Convolution · Average Pooling · Dilated Convolution · Residual Connection · Pyramid Pooling Module · Dropout · Batch Normalization · Concatenated Skip Connection · Pix2Pix · Deformable Convolution