# Optimizing polymorphic tomato picking detection: improved YOLOv8n architecture to tackle data under complex environments

**Authors:** Qiang Li, Jie Mao, Pengxin Zhao, Qing Lv, Chao Fu

PMC · DOI: 10.3389/fpls.2025.1660480 · 2026-01-14

## TL;DR

This paper improves YOLOv8n to better detect ripe and small tomatoes in complex environments, enhancing accuracy and efficiency for automated harvesting.

## Contribution

The study introduces a modified YOLOv8n that combines a Space-to-Depth convolution module (SPD), Parallelized Patch-Aware Attention (PPA), and a Detect_CBAM detection head to improve tomato detection in challenging agricultural settings.
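The abstract expands SPD as a Space-to-Depth convolution module but gives no further detail here. As a rough illustration only, the sketch below shows the generic space-to-depth rearrangement that such modules are usually built on: each `scale x scale` spatial block is moved into the channel dimension, so resolution drops without discarding pixels. The function name `space_to_depth`, the nested-list layout (channels x height x width), and the output channel ordering are assumptions for this example, not the authors' implementation, and the convolution that would follow is omitted.

```python
def space_to_depth(x, scale=2):
    """Rearrange a C x H x W nested list into (C*scale^2) x (H/scale) x (W/scale).

    Hypothetical helper for illustration; the channel ordering is an assumption.
    """
    height, width = len(x[0]), len(x[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    for dy in range(scale):          # row offset inside each scale x scale block
        for dx in range(scale):      # column offset inside each block
            for channel in x:
                # Keep every `scale`-th pixel starting at (dy, dx).
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


# One 4x4 channel becomes four 2x2 channels; no value is dropped.
feature_map = [[[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11],
                [12, 13, 14, 15]]]
packed = space_to_depth(feature_map)
```

Unlike a strided convolution or pooling step, this downsampling is lossless, which is the usual argument for using it when small targets occupy only a few pixels.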

## Key findings

- The improved model achieved 89.6% precision and 87.3% recall for tomato detection.
- It outperformed YOLOv8n and most of the comparison models, reaching 93.5% mAP@0.5 and 58.6% mAP@0.5:0.95.
- The model provides reliable detection for ripe and small tomatoes under leaf occlusion and uneven lighting.

## Abstract

Tomatoes are a key economic crop in modern agriculture, but their complex growth environments make harvesting difficult. Traditional object detection techniques have limited performance and struggle to accurately identify and locate ripe and small-target tomatoes under leaf occlusion and uneven illumination.

To address these issues, this study takes YOLOv8n as the baseline model and improves it to meet the core requirements of tomato detection. It first analyzes YOLOv8n’s inherent bottlenecks in feature extraction and small-target recognition, then proposes targeted remedies. To strengthen feature extraction, a Space-to-Depth convolution module (SPD) restructures the convolutional operations. To improve small-target detection, a dedicated small-target detection layer is added and integrated with the Parallelized Patch-Aware Attention mechanism (PPA). To balance performance and efficiency, a lightweight Slim-Neck structure and a self-developed Detect_CBAM detection head are adopted. Finally, the Distance-Intersection over Union (DIoU) loss function optimizes gradient distribution during training. Experiments are conducted on the self-built “tomato_dataset” (7,160 images: 5,008 for training, 720 for validation, and 1,432 for testing). The evaluation metrics are bounding-box precision, recall, mAP@0.5, mAP@0.5:0.95, parameter count, and FLOPs. The model is compared with mainstream YOLO models (YOLOv5n, YOLOv6n, YOLOv8n), lightweight models (SSD-MobileNetv2, EfficientDet-D0), and two-stage algorithms (Faster R-CNN, Cascade R-CNN).
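To make the DIoU loss mentioned above concrete, here is a minimal sketch of its standard definition: one minus the IoU, plus the squared distance between box centers divided by the squared diagonal of the smallest box enclosing both. The function name `diou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions for illustration; the paper's training code may differ.

```python
def diou_loss(box_a, box_b, eps=1e-9):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration, following the standard DIoU formula.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    center_dist_sq = (((ax1 + ax2) - (bx1 + bx2)) / 2) ** 2 \
        + (((ay1 + ay2) - (by1 + by2)) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    diag_sq = enc_w ** 2 + enc_h ** 2

    return 1.0 - iou + center_dist_sq / (diag_sq + eps)


target = (0.0, 0.0, 1.0, 1.0)
near_miss = diou_loss((2.0, 0.0, 3.0, 1.0), target)
far_miss = diou_loss((4.0, 0.0, 5.0, 1.0), target)
```

Both predictions have zero overlap with the target, so a plain IoU loss would score them identically. The center-distance term makes the farther box cost more, so the loss still distinguishes non-overlapping predictions by how far they are from the target.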

Results show that the improved model achieves 89.6% precision, 87.3% recall, 93.5% mAP@0.5, and 58.6% mAP@0.5:0.95. It significantly outperforms YOLOv8n and most of the comparison models, and it surpasses the two-stage algorithms in both detection accuracy and efficiency.

In conclusion, this study addresses the problem of detecting ripe and small-target tomatoes in polymorphic environments, improves the model’s accuracy and robustness, provides reliable technical support for automated harvesting, and contributes to the development of intelligent agriculture.

## Full-text entities

- **Species:** Solanum lycopersicum (tomato, species) [taxon 4081]

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12854074/full.md

---
Source: https://tomesphere.com/paper/PMC12854074