Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation
Zhaohui Zheng, Ping Wang, Dongwei Ren, Wei Liu, Rongguang Ye, Qinghua Hu, Wangmeng Zuo

TL;DR
This paper introduces CIoU loss and Cluster-NMS, two methods that incorporate geometric factors into bounding box regression and Non-Maximum Suppression for object detection and instance segmentation, improving AP and AR without sacrificing inference efficiency.
Contribution
The paper proposes CIoU loss, which brings geometric factors into bounding box regression, and Cluster-NMS, which brings them into NMS, leading to consistent AP and AR improvements.
Findings
Achieved +1.7 AP and +6.2 AR on MS COCO with YOLACT
Cluster-NMS reduces iterations and improves detection accuracy
Methods are efficient and compatible with state-of-the-art models
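The iterative behaviour behind the Cluster-NMS finding can be illustrated with a small sketch. This summary does not describe the algorithm, so everything below is an assumption: it follows the commonly described matrix formulation (boxes sorted by score, an upper-triangular IoU matrix, and a keep vector refined until it stops changing), written in plain Python rather than the batched tensor operations a real detector would use. The names `box_iou` and `cluster_nms_sketch` are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cluster_nms_sketch(boxes, scores, threshold=0.5):
    """Return indices of kept boxes, highest score first (illustrative only)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    # Upper-triangular IoU matrix: x[i][j] is only filled for i < j,
    # so a box can be suppressed only by a higher-scoring one.
    x = [[box_iou(boxes[order[i]], boxes[order[j]]) if i < j else 0.0
          for j in range(n)] for i in range(n)]
    keep = [True] * n
    for _ in range(n):  # at most n rounds are needed to reach a fixed point
        # Box j survives if no currently kept, higher-scoring box overlaps it
        # by at least the threshold; suppressed boxes no longer suppress others.
        new_keep = [all(not keep[i] or x[i][j] < threshold for i in range(j))
                    for j in range(n)]
        if new_keep == keep:
            break
        keep = new_keep
    return [order[j] for j in range(n) if keep[j]]
```

In this sketch each round is a whole-matrix update, which is the property that makes the approach amenable to parallel hardware. The loop stops as soon as the keep vector is unchanged, which can take fewer rounds than there are boxes.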
Abstract
Deep learning-based object detection and instance segmentation have achieved unprecedented progress. In this paper, we propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS), leading to notable gains of average precision (AP) and average recall (AR), without the sacrifice of inference efficiency. In particular, we consider three geometric factors, i.e., overlap area, normalized central point distance and aspect ratio, which are crucial for measuring bounding box regression in object detection and instance segmentation. The three geometric factors are then incorporated into CIoU loss for better distinguishing difficult regression cases. The training of deep models using CIoU loss results in consistent AP and AR improvements in comparison to widely adopted ℓn-norm loss and IoU-based loss.…
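The three geometric factors named in the abstract can be combined into a single loss value roughly as follows. This is a minimal single-pair sketch, not the authors' implementation: it assumes axis-aligned boxes in (x1, y1, x2, y2) form, normalizes the centre distance by the diagonal of the smallest enclosing box, and uses the commonly cited trade-off weight `alpha = v / ((1 - IoU) + v)`. The exact weighting in the paper may differ, and the function name `ciou_loss` and the `eps` guard are illustrative choices.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU-style loss for one predicted box and one ground-truth box.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Factor 1: overlap area, measured as IoU.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Factor 2: squared centre distance, normalized by the squared
    # diagonal of the smallest box enclosing both boxes.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Factor 3: aspect-ratio consistency.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)  # assumed trade-off weight

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, which is flat at 1 for every non-overlapping pair, the distance term here still grows as the predicted box moves away from the target, so such cases remain distinguishable.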
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Convolution · Non Maximum Suppression · 1x1 Convolution · SSD
