# LGF‐Net: A multi‐scale feature fusion network for thyroid nodule ultrasound image classification

**Authors:** Yao Xiao, Yan Zhuang, Wenwu Ling, Shouyu Jiang, Ke Chen, Guoliang Liao, Yuhua Xie, Yao Hou, Lin Han, Zhan Hua, Yan Luo, Jiangli Lin

PMC · DOI: 10.1002/acm2.70149 · 2025-07-25

## TL;DR

This paper introduces LGF-Net, a dual-branch CNN-Transformer network that fuses local and global features to improve thyroid nodule classification in ultrasound images.

## Contribution

LGF-Net pairs a CNN branch, which captures local fine-grained features, with a Transformer branch, which captures global spatial features, and merges the two through an adaptive feature fusion module.

## Key findings

- LGF-Net outperforms state-of-the-art methods on both public and private thyroid nodule datasets.
- The model reaches 91.24% accuracy on a private clinical dataset and 81.50% on the public TNCD, which the authors take as evidence of strong generalization.
- Ablation studies and visualization confirm the effectiveness and interpretability of the model components.

## Abstract

Thyroid cancer is one of the most common cancers in clinical practice, and accurate classification of thyroid nodule ultrasound images is crucial for computer‐aided diagnosis. Models based solely on a convolutional neural network (CNN) or a Transformer struggle to integrate local and global features, which limits recognition accuracy.

Our method is designed to capture both the key local fine‐grained features and the global spatial features essential for thyroid nodule diagnosis simultaneously. It adapts to the irregular morphology of thyroid nodules, dynamically focuses on the key pixel‐level regions of thyroid nodules, and thereby improves the model's recognition accuracy and generalization ability.

The proposed multi‐scale fusion model, the local and global feature fusion network (LGF‐Net), inspired by the dual‐path mechanism of human visual diagnosis, consists of two branches: a CNN branch and a Transformer branch. The CNN branch integrates the wavelet transform and deformable convolution module (WTDCM) to enhance the model's ability to capture discriminative local features and recognize fine‐grained textures. By introducing the aggregated attention (AA) mechanism, which mimics biological vision, into the Transformer branch, spatial features are effectively captured. The adaptive feature fusion module (FFM) is then utilized to integrate the multi‐scale features of thyroid nodules, further improving classification performance.
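To make the wavelet step in the CNN branch more concrete, here is a minimal sketch of a single-level 2D wavelet decomposition. It is not the authors' WTDCM: the abstract does not name the wavelet family, so the Haar wavelet is an assumption, and the function name `haar_dwt2` and its sub-band names are hypothetical. The sketch only shows how an image splits into one coarse band and three detail bands, which is the kind of fine-grained texture signal a CNN branch could then process.

```python
def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform (assumed wavelet family).

    `image` is a 2D list of numbers with even height and width.
    Returns four half-resolution sub-bands:
    (approx, col_detail, row_detail, diag_detail).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")

    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block being transformed:
            #   p q
            #   s t
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2)  # coarse structure
            c_row.append((p - q + s - t) / 2)  # change across columns
            r_row.append((p + q - s - t) / 2)  # change across rows
            d_row.append((p - q - s + t) / 2)  # diagonal change
        approx.append(a_row)
        col_detail.append(c_row)
        row_detail.append(r_row)
        diag_detail.append(d_row)
    return approx, col_detail, row_detail, diag_detail


if __name__ == "__main__":
    # Toy 2x2 "image"; real ultrasound inputs would be far larger.
    bands = haar_dwt2([[1, 3], [5, 7]])
    for name, band in zip(("approx", "col", "row", "diag"), bands):
        print(name, band)
```

Because the transform is orthonormal, the sum of squared coefficients across the four bands equals the sum of squared pixel values, so no information is lost in the split.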

We evaluated our model on the public thyroid nodule classification dataset (TNCD) and a private clinical dataset using accuracy, recall, precision, and F1‐score. On TNCD, the model achieved 81.50%, 79.51%, 79.92%, and 79.70%, respectively. On the private dataset, it reached 91.24%, 88.90%, 90.73%, and 89.73%, respectively. These results outperformed state‐of‐the‐art methods. We also conducted ablation studies and visualization analysis to validate the model's components and interpretability.
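For readers less familiar with the four reported metrics, the sketch below computes them from label lists. The abstract does not state the averaging scheme, so macro averaging over classes is an assumption here, and the function name `macro_metrics` and the toy labels are hypothetical rather than taken from the paper.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score.

    Macro averaging (an assumption) computes each metric per class
    and then takes the unweighted mean over classes.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []

    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels: 1 = malignant, 0 = benign (illustrative only).
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 0, 1]
    print(macro_metrics(truth, preds))
```

Under macro averaging, the averaged F1-score is the mean of per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.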

The experiments demonstrate that our method improves the accuracy of thyroid nodule recognition, shows strong generalization ability and potential for clinical application, and offers interpretability that can support clinicians' diagnoses.

## Linked entities

- **Diseases:** thyroid cancer (MONDO:0002108)

## Full-text entities

- **Diseases:** Thyroid cancer (MESH:D013964), cancers (MESH:D009369), thyroid nodule (MESH:D016606)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12301082/full.md

---
Source: https://tomesphere.com/paper/PMC12301082