# PGCNet: a Transformer–CNN hybrid segmentation model for pine wilt disease identification

**Authors:** Jiying Liu, Yaping Zhang, Xu Chen

PMC · DOI: 10.3389/fpls.2026.1760648 · 2026-01-28

## TL;DR

PGCNet is a semantic segmentation model that fuses CNN and Transformer features to identify pine wilt disease in UAV (drone) imagery, with accuracy and computational efficiency suited to real-time monitoring.

## Contribution

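PGCNet introduces a hybrid architecture built on a CSWin Transformer backbone, with a Progressive Guidance Fusion Module (PGFM) for cross-layer fusion of semantic and detail features and a lightweight CAR-ASPP module for multi-scale feature enhancement in disease segmentation.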

## Key findings

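- PGCNet outperforms the mainstream semantic segmentation models it was compared against, in both segmentation accuracy and computational efficiency.
- The model is particularly strong at identifying small-scale disease targets and at handling complex background interference.
- Its low computational cost makes it a practical option for edge deployment and real-time forestry monitoring.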
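Segmentation accuracy for a task like this is commonly reported as mean intersection over union (mIoU). The summary here does not list the paper's exact metrics, so the sketch below is an assumption about the kind of metric involved, not a reproduction of the authors' evaluation code. The labels are made up for illustration (0 = background, 1 = diseased). The example shows why small targets are hard: pixel accuracy stays high even when the small class is segmented poorly, while IoU exposes the miss.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU per class = TP / (TP + FP + FN); None if the class never occurs."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(cm):
    """Average IoU over the classes that actually occur."""
    vals = [v for v in per_class_iou(cm) if v is not None]
    return sum(vals) / len(vals)


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class matches the true class."""
    correct = sum(cm[c][c] for c in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Hypothetical flattened masks: 6 background pixels, 2 diseased pixels.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=2)
print("pixel accuracy:", pixel_accuracy(cm))   # 6 of 8 pixels correct
print("per-class IoU:", per_class_iou(cm))     # background 5/7, diseased 1/3
print("mIoU:", mean_iou(cm))
```

In this toy case pixel accuracy is 0.75 while the diseased-class IoU is only 1/3, which is why per-class IoU and mIoU are more informative than raw accuracy when the target class covers few pixels.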

## Abstract

Pine wilt disease, often referred to as the “cancer of pine trees,” is characterized by its rapid spread and extremely high mortality rate, posing a severe threat to forest ecosystems. Currently, most automatic identification methods for pine wilt disease based on UAV remote sensing imagery rely on a single architecture, either a Convolutional Neural Network (CNN) or a Transformer, and therefore suffer from limitations such as restricted receptive fields, insufficient global context modeling, and loss of local details. Existing fusion strategies typically adopt simple stacking or parallel designs without an effective hierarchical feature interaction mechanism, resulting in inadequate integration of semantic and detailed information, as well as high computational overhead, which hinders their deployment in edge computing environments. To address these issues, this study proposes PGCNet, a semantic segmentation model that efficiently fuses CNN and Transformer representations. The model employs CSWin Transformer as the backbone network to capture comprehensive global contextual information. A Progressive Guidance Fusion Module (PGFM) is designed to achieve effective cross-layer fusion of semantic and detailed features through a spatial–channel collaborative attention mechanism. Furthermore, a lightweight Context-Aware Residual Atrous Spatial Pyramid Pooling module (CAR-ASPP) is introduced to enhance multi-scale feature representation while significantly reducing the number of parameters and computational complexity. Experimental results demonstrate that PGCNet outperforms mainstream semantic segmentation models across multiple evaluation metrics, showing especially strong performance in scenarios with complex background interference and small-scale disease target identification. The proposed model achieves high accuracy with excellent computational efficiency, offering a practical solution for real-time monitoring and edge deployment of forestry disease detection, and exhibiting strong potential for extension to agricultural remote sensing disease identification tasks.
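The abstract says CAR-ASPP enhances multi-scale features while cutting parameters, but it does not say how. As a rough illustration of the arithmetic behind atrous (dilated) pyramid pooling, the sketch below computes the effective kernel size of a dilated convolution and compares the weight count of a standard convolution with a depthwise separable one. The depthwise separable substitution, the 256-channel width, and the dilation rates 6, 12, 18 (the rates used by DeepLabv3 at output stride 16) are all assumptions chosen for illustration; none of them is confirmed as the CAR-ASPP design.

```python
def effective_kernel(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical branch configuration, not taken from the paper.
K, C_IN, C_OUT = 3, 256, 256
RATES = (6, 12, 18)

for rate in RATES:
    size = effective_kernel(K, rate)
    print(f"dilation {rate:2d}: 3x3 kernel spans {size}x{size} pixels")

standard = conv_params(K, C_IN, C_OUT)
separable = depthwise_separable_params(K, C_IN, C_OUT)
print("standard 3x3 conv weights:      ", standard)
print("depthwise separable conv weights:", separable)
print(f"reduction factor: {standard / separable:.2f}x")
```

Dilation widens the area each branch sees without adding weights, which is what gives a pyramid of branches its multi-scale coverage. The weight count depends only on the kernel size and channel widths, so factorizing each branch is one common way to make such a module lighter.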

## Full-text entities

- **Diseases:** Pine wilt disease (MESH:D004194), cancer (MESH:D009369)

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12891098/full.md

---
Source: https://tomesphere.com/paper/PMC12891098