# Thyroid Nodule Detection and Classification on Small Datasets: An Ensemble Deep Learning Approach with Attention Mechanism and Focal Loss

**Authors:** Wei-Chen Hung, Yi-Kai Chang, Chih-Ming Chang, Po-Wen Cheng, Wu-Chia Lo, Ping-Chia Cheng, Li-Jen Liao

PMC · DOI: 10.3390/diagnostics16060825 · 2026-03-10

## TL;DR

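This paper presents a deep learning framework for classifying thyroid nodules on ultrasound. It pairs YOLO-based region-of-interest detection with an attention-enhanced ResNet18 classifier, and uses focal loss and a 5-fold ensemble to cope with limited data and class imbalance.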

## Contribution

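The main contribution is a two-stage framework for small datasets: YOLO-based region-of-interest detection followed by a ResNet18 classifier with a convolutional block attention module. The classifier is trained with focal loss, weighted random sampling, and mixup augmentation, and the final prediction comes from a 5-fold cross-validation ensemble.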
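The ensemble step can be illustrated with a minimal sketch. The source does not say how the fold models are combined, so averaging the predicted malignancy probabilities (soft voting) and thresholding at 0.5 are assumptions here. The function name `ensemble_predict` and the example probabilities are hypothetical.

```python
from statistics import mean


def ensemble_predict(fold_probs, threshold=0.5):
    """Average per-fold malignancy probabilities and threshold the mean.

    Soft voting and the 0.5 threshold are illustrative assumptions;
    the paper summary does not specify the combination rule.
    """
    avg = mean(fold_probs)
    label = "malignant" if avg >= threshold else "benign"
    return avg, label


# Hypothetical outputs of five fold models for one cropped nodule image.
fold_probs = [0.62, 0.48, 0.71, 0.55, 0.66]
avg, label = ensemble_predict(fold_probs)
print(f"mean probability = {avg:.3f}, label = {label}")
```

Averaging probabilities lets one uncertain fold (0.48 in this example) be outvoted by the others, which is one common motivation for ensembling models trained on different folds of a small dataset.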

## Key findings

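- On the independent test set (41 images), the ensemble achieved 85.4% accuracy, 86.4% sensitivity, and 84.2% specificity.
- External validation on 36 images collected from online sources reached 77.8% accuracy, 78.6% sensitivity, and 77.3% specificity, which the authors interpret as evidence of generalizability despite limited data.
- The authors conclude that a lightweight ResNet18 with strategic optimization outperforms deeper networks on small medical datasets.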
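The test-set figures can be checked by hand from the set sizes given in the abstract (22 malignant, 19 benign). The sketch below assumes confusion-matrix counts inferred from the reported percentages, since the source does not state them. It also assumes a normal-approximation (Wald) 95% interval clipped to the range 0 to 100%.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation 95% CI, clipped to [0, 1]."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts inferred from the reported percentages (an assumption, not from the source).
tp, fn = 19, 3  # 22 malignant test images
tn, fp = 16, 3  # 19 benign test images

results = {
    "accuracy": wald_ci(tp + tn, tp + fn + tn + fp),
    "sensitivity": wald_ci(tp, tp + fn),
    "specificity": wald_ci(tn, tn + fp),
}

for name, (p, low, high) in results.items():
    print(f"{name}: {100 * p:.1f}% (95% CI: {100 * low:.1f}-{100 * high:.1f}%)")
```

Under these inferred counts the sketch reproduces the point estimates and intervals reported in the abstract. This suggests, but does not confirm, that the authors used normal-approximation intervals. The intervals are wide because the test set is small, so the point estimates should be read with that uncertainty in mind.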

## Abstract

**Background:** Thyroid nodule classification on ultrasound remains challenging due to limited labeled data and marked class imbalance. This study proposes an integrated deep learning framework combining YOLO-based region-of-interest detection with an enhanced ResNet18 classifier.

**Methods:** A total of 522 thyroid ultrasound images from 522 patients examined between July 2020 and June 2024 were included. The dataset comprised 467 images for training (399 benign, 68 malignant), 41 for independent testing (19 benign, 22 malignant), and 14 for internal validation (4 benign, 10 malignant). An external validation set of 36 images (22 benign, 14 malignant) was collected from online sources. ResNet18 with a convolutional block attention module was used to enhance feature extraction. To address small sample size and class imbalance, the training pipeline incorporated focal loss, weighted random sampling, mixup augmentation, cosine annealing learning rate scheduling, and a 5-fold cross-validation ensemble.

**Results:** The ensemble model achieved 85.4% accuracy (95% CI: 74.5–96.2%), 86.4% sensitivity (95% CI: 72.0–100%), and 84.2% specificity (95% CI: 67.8–100%) on the independent test set. Internal validation yielded 85.7% accuracy, 90.0% sensitivity, and 75.0% specificity, while external validation demonstrated 77.8% accuracy, 78.6% sensitivity, and 77.3% specificity. These findings suggest that advanced regularization combined with ensemble learning improves generalizability despite limited data.

**Conclusions:** This study demonstrates that a lightweight ResNet18 architecture with strategic optimization outperforms deeper networks on small medical datasets. The proposed framework demonstrated good diagnostic performance across multiple validation cohorts, offering a promising computer-aided diagnosis tool for thyroid nodule assessment.
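Two of the imbalance-handling ideas named in the abstract, focal loss and weighted random sampling, can be sketched in plain Python. The focusing parameter `gamma=2.0`, the omission of a class-balancing `alpha` term, and inverse-frequency sample weights are all assumptions, because the abstract gives no hyperparameters. Only the training class counts (399 benign, 68 malignant) come from the source.

```python
import math


def cross_entropy(p_true):
    """Standard cross-entropy given the probability assigned to the true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Focal loss: cross-entropy scaled by (1 - p_true) ** gamma.

    gamma=2.0 is an assumed value; no alpha weighting is applied here.
    """
    return (1.0 - p_true) ** gamma * cross_entropy(p_true)


def inverse_frequency_weights(class_counts):
    """Per-sample weight for each class, proportional to 1 / class size."""
    return {label: 1.0 / count for label, count in class_counts.items()}


# Training class counts reported in the abstract.
train_counts = {"benign": 399, "malignant": 68}
weights = inverse_frequency_weights(train_counts)

# Expected share of malignant images when sampling in proportion to the weights.
total_mass = sum(weights[label] * n for label, n in train_counts.items())
malignant_share = weights["malignant"] * train_counts["malignant"] / total_mass

easy, hard = 0.9, 0.1  # probability assigned to the true class
print(f"easy example: CE={cross_entropy(easy):.4f}, focal={focal_loss(easy):.4f}")
print(f"hard example: CE={cross_entropy(hard):.4f}, focal={focal_loss(hard):.4f}")
print(f"expected malignant share per batch: {malignant_share:.2f}")
```

With these assumed settings, a confidently correct prediction contributes about 1% of its cross-entropy loss while a badly wrong one keeps about 81%, so training focuses on hard cases. Inverse-frequency weights make malignant and benign images equally likely to be drawn, even though malignant cases are only 68 of 467 training images.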

## Full-text entities

- **Diseases:** Thyroid Nodule (MESH:D016606)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13025762/full.md

---
Source: https://tomesphere.com/paper/PMC13025762