CA-YOLO: Cross Attention Empowered YOLO for Biomimetic Localization
Zhen Zhang, Qing Zhao, Xiuhe Li, Cheng Wang, Guoqiang Zhu, Yu Zhang, Yining Huo, Hongyi Yu, Yi Zhang

TL;DR
This paper introduces CA-YOLO, a novel object detection model inspired by biological visual mechanisms, which improves small target recognition and localization accuracy in complex environments, validated by experiments on standard datasets.
Contribution
The paper presents CA-YOLO, which integrates bionic modules (a small target detection head and a Characteristic Fusion Attention Mechanism, CFAM) into the YOLO backbone to improve small target detection and localization accuracy.
Findings
Outperforms the original YOLO on the COCO and VisDrone datasets, with 3.94% and 4.90% higher accuracy, respectively
Demonstrates effectiveness in time-sensitive target localization
Abstract
In modern complex environments, achieving accurate and efficient target localization is essential in numerous fields. However, existing systems often face limitations in both accuracy and the ability to recognize small targets. In this study, we propose a bionic stabilized localization system based on CA-YOLO, designed to enhance both target localization accuracy and small target recognition capabilities. Acting as the "brain" of the system, the target detection algorithm emulates the visual focusing mechanism of animals by integrating bionic modules into the YOLO backbone network. These modules include the introduction of a small target detection head and the development of a Characteristic Fusion Attention Mechanism (CFAM). Furthermore, drawing inspiration from the human Vestibulo-Ocular Reflex (VOR), a bionic pan-tilt tracking control strategy is developed, which incorporates central…
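The abstract names an attention-based fusion module (CFAM) but does not give its internals here, so the following is only a generic sketch of the cross-attention idea referenced in the title, not the authors' module. It shows fine-resolution feature vectors querying a coarser feature map through scaled dot-product attention. The function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    query_feats:   (Nq, C) features that ask the question (e.g. fine scale)
    context_feats: (Nc, C) features that supply keys/values (e.g. coarse scale)
    w_q, w_k, w_v: (C, D) projection matrices (illustrative, randomly set below)
    """
    q = query_feats @ w_q                    # (Nq, D)
    k = context_feats @ w_k                  # (Nc, D)
    v = context_feats @ w_v                  # (Nc, D)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (Nq, Nc) similarity scores
    weights = softmax(scores, axis=-1)       # each query row sums to 1
    return weights @ v, weights              # fused features, attention map

# Toy inputs: a flattened 4x4 fine map and a flattened 2x2 coarse map.
rng = np.random.default_rng(0)
channels, dim = 8, 4
fine = rng.normal(size=(16, channels))
coarse = rng.normal(size=(4, channels))
w_q, w_k, w_v = (rng.normal(size=(channels, dim)) for _ in range(3))

fused, attn = cross_attention(fine, coarse, w_q, w_k, w_v)
```

Each row of `attn` is a probability distribution over the coarse positions, so every fine-scale location receives a weighted mix of coarse-scale context. How CA-YOLO actually wires such attention into the backbone and detection heads is not stated in the truncated abstract.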
Taxonomy
Topics: Augmented Reality Applications · Image and Video Stabilization · Gaze Tracking and Assistive Technology
