Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation

Zhaohui Zheng; Ping Wang; Dongwei Ren; Wei Liu; Rongguang Ye; Qinghua Hu; Wangmeng Zuo

arXiv:2005.03572 · cs.CV · July 6, 2021 · 83 cites

TL;DR

This paper introduces CIoU loss and Cluster-NMS, which incorporate geometric factors into bounding box regression and Non-Maximum Suppression (NMS) for object detection and instance segmentation, improving AP and AR without sacrificing inference efficiency.

Contribution

The paper proposes CIoU loss and Cluster-NMS, which strengthen the role of geometric factors in bounding box regression and NMS, yielding consistent AP and AR improvements.

Findings

01. Achieved +1.7 AP and +6.2 AR on MS COCO with YOLACT

02. Cluster-NMS reduces iterations and improves detection accuracy

03. Methods are efficient and compatible with state-of-the-art models
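The findings mention Cluster-NMS and its iteration count, but this page does not describe the algorithm's steps. As a rough illustration only, the sketch below shows one way a matrix-style iterative NMS can be written: an upper-triangular IoU matrix over score-sorted boxes is re-evaluated until the keep mask stops changing. The function names `box_iou` and `iterative_matrix_nms`, the `(x1, y1, x2, y2)` box layout, and the "suppress when IoU exceeds `thresh`" rule are all assumptions made for this example, not details taken from the paper.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2). Hypothetical helper."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def iterative_matrix_nms(boxes, scores, thresh=0.5):
    """Sketch of a matrix-style iterative NMS (assumed formulation).

    Returns (kept original indices in descending score order, passes used).
    """
    # Sort box indices by descending score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    # X[i][j] holds the IoU between the i-th and j-th ranked boxes for i < j,
    # and 0 elsewhere, so a box can only be suppressed by a higher-ranked one.
    X = [[box_iou(boxes[order[i]], boxes[order[j]]) if i < j else 0.0
          for j in range(n)] for i in range(n)]
    keep = [True] * n
    passes = 0
    for _ in range(n):
        passes += 1
        # A box survives if no currently kept, higher-ranked box overlaps it
        # by more than the threshold.
        new_keep = [all(X[i][j] <= thresh for i in range(n) if keep[i])
                    for j in range(n)]
        if new_keep == keep:  # mask is stable, so the result has converged
            break
        keep = new_keep
    return [order[j] for j in range(n) if keep[j]], passes
```

In this sketch each pass is a whole-matrix operation, which is what makes the idea attractive on a GPU, and the loop stops as soon as the keep mask is stable rather than always running once per box.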

Abstract

Deep learning-based object detection and instance segmentation have achieved unprecedented progress. In this paper, we propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS), leading to notable gains of average precision (AP) and average recall (AR), without the sacrifice of inference efficiency. In particular, we consider three geometric factors, i.e., overlap area, normalized central point distance and aspect ratio, which are crucial for measuring bounding box regression in object detection and instance segmentation. The three geometric factors are then incorporated into CIoU loss for better distinguishing difficult regression cases. The training of deep models using CIoU loss results in consistent AP and AR improvements in comparison to widely adopted $\ell_n$-norm loss and IoU-based loss.…
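The abstract names three geometric factors (overlap area, normalized central point distance and aspect ratio) but does not give the loss formula. The sketch below shows one commonly used way to combine them for a single pair of boxes: an IoU term, the squared center distance divided by the squared diagonal of the smallest enclosing box, and an arctan-based aspect-ratio penalty. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, the `eps` stabilizer, and the exact form of the weighting `alpha` are assumptions for illustration; published variants differ in how `alpha` is defined.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Sketch of a CIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Assumed form: 1 - IoU + rho^2 / c^2 + alpha * v.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Factor 1: overlap area, expressed as IoU.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Factor 2: squared distance between box centers, normalized by the
    # squared diagonal of the smallest box enclosing both.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2

    # Factor 3: aspect-ratio consistency (0 when both boxes share a ratio).
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps))
                              - math.atan(w / (h + eps))) ** 2
    # Assumed trade-off weight; other definitions exist.
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / (c2 + eps) + alpha * v
```

Unlike a plain IoU loss, which is constant for all non-overlapping boxes, the distance term in this sketch keeps growing as the predicted box moves away from the target, so disjoint cases can still be ranked.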

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Convolution · Non Maximum Suppression · 1x1 Convolution · SSD