# Multimodal cross-attention network for overgrowth detection in strawberry seedlings

**Authors:** Zhenzhen Cheng, Yifan Cheng, Tingting Fang, Man Zhu, Jing Liu, Peng Qi, Qiaoyu Zhang

PMC · DOI: 10.3389/fpls.2025.1706694 · Frontiers in Plant Science · 2026-01-02

## TL;DR

This paper introduces MM-CAPNet, a multimodal framework that detects early overgrowth in strawberry seedlings by fusing historical environmental time-series data with plant images, reaching 87.6% accuracy and 0.901 AUC.

## Contribution

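The novelty lies in an image-guided cross-attention mechanism: the current visual phenotype acts as the query that retrieves and aggregates the most diagnostically relevant segments of the historical environmental sequence, linking what the plant looks like now to the conditions that preceded it.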

## Key findings

- MM-CAPNet achieved 87.6% accuracy and 0.901 AUC in detecting overgrowth in strawberry seedlings.
- The cross-attention mechanism improves interpretability by linking visual symptoms to environmental factors.
- As a proof of concept, the framework supports precision cultivation by enabling early regulation of fertilization, irrigation, and light management during the nursery stage.
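For readers less familiar with the two metrics reported above, the following is a minimal sketch of how accuracy and AUC can be computed for a binary classifier. The labels and scores are made-up toy values, not data from the paper, and the function names are illustrative. The paper refers to overgrowth "categories", so its AUC may be a multi-class average; the binary case is shown here only as an assumption to keep the example short.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values for illustration only: 1 = overgrowth, 0 = normal.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

acc = accuracy(labels, scores)
auc = roc_auc(labels, scores)
print(f"accuracy={acc:.3f}, auc={auc:.3f}")
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.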

## Abstract

Early warning of overgrowth in strawberry seedlings is essential to balance vegetative and reproductive growth. However, existing monitoring methods face major challenges, including subtle visual symptoms and limited abnormal samples. To address this, we propose MM-CAPNet, a multimodal fusion framework for early detection of seedling overgrowth. We first developed a representative sample collection of strawberry seedlings through a systematic induction experiment, integrating historical environmental time-series data with contemporaneous plant images. The MM-CAPNet architecture uses a dual-stream design to process these inputs, with a Transformer encoder for environmental sequences and a MobileNetV2 encoder for images. A critical component of the proposed framework lies in the image-guided Cross-Attention mechanism, which uniquely treats the current phenotype as an active query to adaptively retrieve and aggregate the most diagnostically relevant segments of past environmental data. Experiments show MM-CAPNet outperforms baselines, reaching 87.6% accuracy and 0.901 AUC, with strong discriminative ability for early overgrowth categories. Ablation studies confirm its interpretability by linking visual phenotypes to key environmental drivers. This work provides growers with a proof-of-concept framework to regulate fertilization, irrigation, and light management during the nursery stage, thereby reducing the risk of excessive vegetative growth. The proposed framework supports precision cultivation strategies that enhance resource efficiency and crop resilience.
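The image-guided cross-attention described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single attention head, omits the learned query/key/value projections a real model would have, replaces the MobileNetV2 and Transformer encoders with hand-written toy vectors, and assumes (hypothetically) that the fused feature is a simple concatenation. All names and numbers below are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.

    query:  image embedding (length d)
    keys:   one vector per environmental time step (each length d)
    values: one vector per environmental time step
    Returns the attention-weighted context vector and the weights.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim_v = len(values[0])
    context = [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(dim_v)
    ]
    return context, weights


# Toy stand-in for the image encoder output (assumed 4-dimensional).
image_embedding = [1.0, 0.0, 1.0, 0.0]

# Toy stand-in for the encoded environmental sequence: 3 time steps.
env_sequence = [
    [0.1, 0.9, 0.0, 0.8],
    [0.9, 0.1, 1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]

# The image is the query; the environmental steps serve as keys and values.
context, weights = cross_attention(image_embedding, env_sequence, env_sequence)

# Assumed fusion step: concatenate the image feature with the retrieved context.
fused = image_embedding + context
print("attention weights per time step:", [round(w, 3) for w in weights])
```

The attention weights form a distribution over past time steps, which is what makes this kind of design inspectable: a high weight marks an environmental segment the current phenotype "points to".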

## Full-text entities

- **Diseases:** chlorosis (MESH:D000747), necrosis (MESH:D009336), Early Overgrowth (MESH:C537340)
- **Chemicals:** CO2 (MESH:D002245), water (MESH:D014867)
- **Species:** Oryza sativa (Asian cultivated rice, species) [taxon 4530], Homo sapiens (human, species) [taxon 9606], Solanum lycopersicum (tomato, species) [taxon 4081], Brassica oleracea (wild cabbage, species) [taxon 3712], Fragaria x ananassa (strawberry, species) [taxon 3747], Glycine max (soybean, species) [taxon 3847]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12808427/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12808427/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/PMC12808427/full.md

---
Source: https://tomesphere.com/paper/PMC12808427