CBAM-STN-TPS-YOLO: Enhancing Agricultural Object Detection through Spatially Adaptive Attention Mechanisms
Satvik Praveen, Yoonsung Jung

TL;DR
This paper introduces CBAM-STN-TPS-YOLO, a novel agricultural object detection model that combines non-rigid spatial transformations and attention mechanisms to improve accuracy in complex, occlusion-heavy scenarios.
Contribution
The paper presents a new model integrating Thin-Plate Splines with Spatial Transformer Networks and CBAM for enhanced non-rigid spatial alignment and feature emphasis in agricultural detection tasks.
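To make the non-rigid alignment idea concrete, here is a minimal pure-Python sketch of a generic 2D thin-plate spline warp. It is not the authors' implementation. In an STN, a localization network would predict the control-point targets; here they are given by hand. The names `tps_kernel`, `solve`, `fit_tps`, and `tps_warp` are hypothetical helpers introduced only for this illustration.

```python
import math

def tps_kernel(r2):
    """TPS radial basis U(r) = r^2 * log(r), written in terms of r2 = r^2."""
    return 0.0 if r2 == 0.0 else 0.5 * r2 * math.log(r2)

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (control points may be collinear)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [[M[i][n + k] / M[i][i] for k in range(len(B[0]))] for i in range(n)]

def fit_tps(src, dst):
    """Fit TPS parameters mapping control points src -> dst.

    Returns n rows of radial weights followed by 3 rows of affine terms
    (constant, x, y), each row holding one value per output coordinate.
    """
    n = len(src)
    A = [[0.0] * (n + 3) for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            A[i][j] = tps_kernel((xi - xj) ** 2 + (yi - yj) ** 2)
        for k, p in enumerate((1.0, xi, yi)):
            A[i][n + k] = p   # affine block P
            A[n + k][i] = p   # side conditions P^T w = 0
    B = [list(d) for d in dst] + [[0.0, 0.0] for _ in range(3)]
    return solve(A, B)

def tps_warp(point, src, params):
    """Apply the fitted TPS to an arbitrary 2D point."""
    x, y = point
    n = len(src)
    out = []
    for k in range(2):
        value = params[n][k] + params[n + 1][k] * x + params[n + 2][k] * y
        for i, (xi, yi) in enumerate(src):
            value += params[i][k] * tps_kernel((x - xi) ** 2 + (y - yi) ** 2)
        out.append(value)
    return tuple(out)

# Four fixed corners plus a centre point nudged to the right: a bend that
# no single affine map can reproduce.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
dst = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.6, 0.5)]
params = fit_tps(src, dst)
warped = [tps_warp(p, src, params) for p in src]
```

The fitted warp passes through every control point exactly (up to floating-point error). When the targets are an affine image of the sources, the radial weights vanish and the spline reduces to that affine map, which is why TPS can be seen as a strict generalization of the affine STN.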
Findings
Outperforms previous models such as STN-YOLO in precision, recall, and mAP on the PGP dataset
Reduces false positives by 12%
Supports real-time edge deployment
Abstract
Object detection is vital in precision agriculture for plant monitoring, disease detection, and yield estimation. However, models like YOLO struggle with occlusions, irregular structures, and background noise, reducing detection accuracy. While Spatial Transformer Networks (STNs) improve spatial invariance through learned transformations, affine mappings are insufficient for non-rigid deformations such as bent leaves and overlaps. We propose CBAM-STN-TPS-YOLO, a model integrating Thin-Plate Splines (TPS) into STNs for flexible, non-rigid spatial transformations that better align features. Performance is further enhanced by the Convolutional Block Attention Module (CBAM), which suppresses background noise and emphasizes relevant spatial and channel-wise features. On the occlusion-heavy Plant Growth and Phenotyping (PGP) dataset, our model outperforms STN-YOLO in precision, recall,…
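The attention step mentioned in the abstract can be illustrated with a small pure-Python sketch of a CBAM-style block: channel attention followed by spatial attention, as in the original CBAM design. This is an assumption-laden toy, not the paper's code. The weights `W1`, `W2`, and `kernel` are hand-picked stand-ins for learned parameters, the spatial convolution is 3x3 rather than the 7x7 commonly used, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def shared_mlp(v, W1, W2):
    """Two-layer MLP with ReLU, shared by the avg- and max-pooled vectors."""
    hidden = [max(0.0, h) for h in matvec(W1, v)]
    return matvec(W2, hidden)

def channel_attention(F, W1, W2):
    """One weight in (0, 1) per channel of a C x H x W feature map."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in F]
    mx = [max(map(max, ch)) for ch in F]
    a, m = shared_mlp(avg, W1, W2), shared_mlp(mx, W1, W2)
    return [sigmoid(x + y) for x, y in zip(a, m)]

def spatial_attention(F, kernel):
    """One weight in (0, 1) per pixel; kernel has shape 2 x k x k."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(F[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for m_idx, fmap in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:  # zero padding
                            s += kernel[m_idx][di][dj] * fmap[y][x]
            out[i][j] = sigmoid(s)
    return out

def cbam(F, W1, W2, kernel):
    """Rescale channels first, then rescale spatial positions."""
    mc = channel_attention(F, W1, W2)
    F1 = [[[mc[c] * v for v in row] for row in ch] for c, ch in enumerate(F)]
    ms = spatial_attention(F1, kernel)
    return [[[ms[i][j] * v for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for ch in F1]

# Toy 2 x 3 x 3 feature map with illustrative (not learned) weights.
F = [[[1.0, 2.0, 0.0], [0.5, 1.5, 1.0], [0.0, 1.0, 2.0]],
     [[-1.0, 0.0, 1.0], [2.0, -0.5, 0.5], [1.0, 0.0, -2.0]]]
W1 = [[0.5, -0.25]]        # reduces 2 channels to 1 hidden unit
W2 = [[1.0], [-1.0]]       # expands back to 2 channels
kernel = [[[0.1] * 3 for _ in range(3)], [[-0.1] * 3 for _ in range(3)]]
out = cbam(F, W1, W2, kernel)
```

Because both attention maps pass through a sigmoid, every output value is the input scaled by a factor between 0 and 1. The block can only attenuate features, which matches the abstract's description of suppressing background noise while keeping relevant responses.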
Taxonomy
Topics: Smart Agriculture and AI · Remote Sensing in Agriculture · Advanced Neural Network Applications
