# Precision cotton disease detection via transformer models applied to leaf imagery

**Authors:** Nikhil Inamdar, Manjunath Managuli, Ramesh Koti, Jagadish Jakati, Sharanappa P. H., Prasan Kulkarni

PMC · DOI: 10.3389/frai.2025.1743264 · Frontiers in Artificial Intelligence · 2026-02-09

## TL;DR

This paper presents a deep learning framework using transformer models to detect cotton leaf diseases with high accuracy, aiding in agricultural monitoring and crop health management.

## Contribution

The paper applies transformer-based architectures (ViT, Swin Transformer, DeiT, and T2T-ViT) to four-class cotton leaf disease classification and evaluates them with a stratified K-fold testing scheme.

## Key findings

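- The best-performing transformer model reached 99.99% accuracy in classifying cotton leaf diseases.
- Stratified K-fold testing preserved the class distribution across training and testing folds, which the authors use to address class imbalance; results stayed consistent across folds.
- Standard image augmentation and normalization were applied to improve generalization and to match the input requirements of transformer models.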
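The stratified splitting mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function name `stratified_k_fold`, the class counts, and the round-robin assignment are assumptions made for illustration. Only the four class names and the idea of keeping the class mix equal across folds come from the summary.

```python
import random
from collections import Counter, defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full dataset.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled samples round-robin into the k folds,
    # so every fold receives a near-equal share of every class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Assumed, deliberately imbalanced label list (counts are made up).
labels = (
    ["curl_virus"] * 50
    + ["bacterial_blight"] * 30
    + ["fusarium_wilt"] * 15
    + ["healthy"] * 25
)

splits = list(stratified_k_fold(labels, k=5))
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_no, dict(Counter(labels[i] for i in test_idx)))
```

Because every class count here is divisible by 5, each test fold holds exactly one fifth of each class. With real data the shares would differ by at most one sample per class.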

## Abstract

Automated plant identification from leaf images has great potential for improving agricultural research, ecological monitoring, and biodiversity conservation. This work introduces a deep learning-based framework that uses transformer-based architectures, namely the Vanilla Vision Transformer (ViT), Swin Transformer, DeiT (Data-Efficient Image Transformer), and T2T-ViT (Tokens-to-Tokens Vision Transformer), to automatically classify cotton leaf diseases. The dataset consists of cotton leaf images from four classes: curl virus, bacterial blight, fusarium wilt, and healthy leaves. A stratified K-fold hold-out testing technique (K = 1 to 5) maintains the class distribution across training and testing folds, supporting robust model evaluation and addressing class imbalance. Standard image augmentation and normalization are applied to improve generalization and ensure compatibility with transformer models. All models are pretrained on large image collections and then fine-tuned on the cotton leaf data to improve their ability to discriminate between classes. Results remain consistent across folds, with the best model reaching 99.99% accuracy. These results suggest that transformer-based models combined with stratified K-fold evaluation offer a dependable way to detect crop diseases early, supporting faster and better-informed farm management.
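The abstract does not specify which augmentation and normalization steps were used. As a rough sketch only, the example below assumes a horizontal flip as the augmentation and the widely used ImageNet channel means and standard deviations for normalization, a common choice when fine-tuning models pretrained on ImageNet. The function names and the tiny 2x2 image are hypothetical.

```python
# Widely used ImageNet channel statistics (assumed here; the summary does not
# state which values the authors used).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std))


def horizontal_flip(image):
    """Mirror an image (a list of rows of pixels) left to right."""
    return [list(reversed(row)) for row in image]


def preprocess(image, flip=False):
    """Optionally flip the image, then normalize every pixel."""
    if flip:
        image = horizontal_flip(image)
    return [[normalize_pixel(px) for px in row] for row in image]


# Hypothetical 2x2 RGB image standing in for a leaf photo.
image = [
    [(0, 0, 0), (255, 255, 255)],
    [(124, 116, 104), (10, 200, 30)],
]

out = preprocess(image, flip=True)
print(out[0][0])  # normalized white pixel, moved to the left by the flip
```

In a real pipeline the same normalization would be applied at training and test time, while random augmentations such as the flip would be applied only during training.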

## Full-text entities

- **Diseases:** bacterial blight (MESH:D001424), infected (MESH:D007239), cotton disease (MESH:D004194)
- **Chemicals:** gold (MESH:D006046)
- **Species:** Glycine max (soybean, species) [taxon 3847]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12926397/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12926397/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12926397/full.md

---
Source: https://tomesphere.com/paper/PMC12926397