# MedNeXt for accurate medical image classification and segmentation: A lightweight transformer-style convolutional neural network

**Authors:** Ziqing Xue, Pengpeng Pi, Ziyi Liu, Zhaomu Zeng, Zhiwei Sun

PMC · DOI: 10.1371/journal.pone.0340108 · PLOS One · 2026-01-05

## TL;DR

This paper introduces MedNeXt, a lightweight backbone for medical image classification and segmentation that replaces self-attention with large-kernel convolutions and MLPs to balance accuracy against computational cost.

## Contribution

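The contribution is a backbone built solely from lightweight convolutional and MLP operations. Three modules stand in for the self-attention operation: a Linear Attention Feed Forward Network (FFN) that enhances lesion features, a Spatial Encoding Module that integrates multi-scale lesion information, and a Smooth Depth-Wise Convolution (DwConv) FFN for efficient interaction of channel features.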

## Key findings

- MedNeXt reports scores of 98.39% on SARS-COV2-CT-Scan, 98.12% on Monkeypox Skin Lesion Dataset, 98.58% on Large COVID-19-CT scan slice, 79.45% on Synapse, and 91.28% on ACDC.
- The model balances performance and efficiency with low computational cost and good generalizability.
- It performs well even with limited training data and on diverse medical tasks.

## Abstract

Transformer-based deep learning architectures have achieved notable success across various medical image analysis tasks, driven by the global modeling capabilities of the self-attention mechanism. However, Transformer-based methods exhibit significant computational complexity and a large number of parameters, rendering them challenging to apply effectively in practical medical scenarios. Compared with Transformers, large-kernel Convolutional Neural Networks (CNNs) and Multi-Layer Perceptrons (MLPs) offer more efficient inference while retaining global contextual awareness. Therefore, we rethink the role of large-kernel CNNs and MLPs in medical image analysis and leverage them to replace the heavy self-attention operation, to strike a better balance between performance and efficiency. Specifically, we propose backbone models for medical image classification and segmentation, featuring three lightweight modules: Linear Attention Feed Forward Network (FFN) for enhancing lesion features, Spatial Encoding Module for integrating multi-scale lesion information, and Smooth Depth-Wise Convolution (DwConv) FFN for efficient interaction of channel features. Composed solely of lightweight convolutional and MLP operations, our method achieves a better balance between performance and efficiency, validated by superior performance on five datasets with varying data scales and diseases, with 98.39% on SARS-COV2-CT-Scan, 98.12% on Monkeypox Skin Lesion Dataset, 98.58% on Large COVID-19-CT scan slice, 79.45% on Synapse and 91.28% on ACDC. The low computational cost, high performance with limited training data, and generalizability to a variety of medical tasks make the proposed method a promising and practical solution for medical image classification and segmentation.
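
The efficiency argument in the abstract can be illustrated with a rough operation count. The sketch below is not the paper's method or its reported measurements: the function names, the single-head attention layout, and the example feature-map size and kernel size are all assumptions chosen for illustration. It only compares how the multiply-accumulate (MAC) count of a generic self-attention layer and of a generic large-kernel depth-wise convolution (followed by a 1x1 point-wise convolution) scale with spatial resolution.

```python
# Illustrative only: generic layer cost models, not the MedNeXt modules.
# All names and sizes here are hypothetical.

def self_attention_macs(h, w, c):
    """Approximate MACs for one single-head self-attention layer
    over an h x w feature map with c channels."""
    n = h * w                      # one token per spatial position
    qkv_proj = 3 * n * c * c       # Q, K, V linear projections
    scores = n * n * c             # Q @ K^T, quadratic in token count
    weighted_sum = n * n * c       # attention weights @ V
    out_proj = n * c * c           # output projection
    return qkv_proj + scores + weighted_sum + out_proj


def large_kernel_dwconv_macs(h, w, c, k):
    """Approximate MACs for a k x k depth-wise convolution followed by
    a 1x1 point-wise convolution (stride 1, same padding)."""
    n = h * w
    depthwise = n * c * k * k      # each channel filtered independently
    pointwise = n * c * c          # 1x1 conv mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    c, k = 64, 7                   # assumed channel width and kernel size
    for side in (56, 112):         # assumed feature-map resolutions
        attn = self_attention_macs(side, side, c)
        conv = large_kernel_dwconv_macs(side, side, c, k)
        print(f"{side}x{side}: attention={attn:,} MACs, dwconv={conv:,} MACs")
```

Under these assumptions the convolutional cost grows linearly with the number of spatial positions, while the attention cost contains a term that grows quadratically, which is the kind of gap the abstract refers to when it calls self-attention heavy.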

## Linked entities

- **Diseases:** SARS-COV2 (MONDO:0100096), Monkeypox (MONDO:0002594), COVID-19 (MONDO:0100096)

## Full-text entities

- **Diseases:** lesion (MESH:D009059), COVID-19 (MESH:D000086382), Skin Lesion (MESH:D012871)
- **Species:** Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12768253/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12768253/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/PMC12768253/full.md

---
Source: https://tomesphere.com/paper/PMC12768253