# Deep learning models for deriving optimised measures of fat and muscle mass from MRI

**Authors:** Belvin Thomas, M. Adam Ali, Fatima M. H. Ali, Anthony Chung, Manjiri Joshi, Sophia Maiguma-Wilson, Gabrielle Reiff, Hadil Said, Pardis Zalmay, Michael Berks, Matthew D. Blackledge, James P. B. O’Connor

PMC · DOI: 10.1038/s41598-025-07867-w · Scientific Reports · 2025-07-17

## TL;DR

This paper compares 15 CNN-based and 4 transformer-based deep learning architectures for measuring fat and muscle mass from abdominal MRI, finding that accuracy and repeatability vary by tissue type and model architecture.

## Contribution

The study systematically evaluates the accuracy, precision and ability to track change of CNN-based and transformer-based models that quantify fat and muscle mass from abdominal MRI, comparing them with human observers across four tissues.

## Key findings

- CNN-based models outperformed transformer-based models in measuring intra-abdominal fat, whereas psoas muscle delineation had poor accuracy and repeatability for all assessments.
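- Human observers measured intra-abdominal fat most accurately. For subcutaneous fat and external muscle, accuracy differed negligibly between human observers and all deep learning models, and both tissues had excellent repeatability.
- Accuracy and precision varied between CNN-based and transformer-based models, between tissues, and in some cases with gender, so these factors should be considered when deploying deep learning methods to estimate fat and muscle biomarkers.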

## Abstract

Fat and muscle mass are potential biomarkers of wellbeing and disease in oncology, but clinical measurement methods vary considerably. Here we evaluate the accuracy, precision and ability to track change for multiple deep learning (DL) models that quantify fat and muscle mass from abdominal MRI. Specifically, subcutaneous fat (SF), intra-abdominal fat (VF), external muscle (EM) and psoas muscle (PM) were evaluated using 15 convolutional neural network (CNN)-based and 4 transformer-based deep learning model architectures. There was negligible difference in the accuracy of human observers and all deep learning models in delineating SF or EM. Both of these tissues had excellent repeatability of their delineation. VF was measured most accurately by the human observers, then by CNN-based models, which outperformed transformer-based models. In contrast, PM delineation accuracy and repeatability were poor for all assessments. Repeatability limits of agreement determined when changes measured in individual patients were due to real change rather than test-retest variation. In summary, DL model accuracy and precision of delineating fat and muscle volumes vary between CNN-based and transformer-based models, between different tissues and in some cases with gender. These factors should be considered when investigators deploy deep learning methods to estimate biomarkers of fat and muscle mass.
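The abstract's use of repeatability limits of agreement to separate real change from test-retest variation can be illustrated with a minimal sketch. It assumes the common Bland-Altman form (mean difference ± 1.96 × standard deviation of paired test-retest differences); this summary does not state the paper's exact calculation, and the function names and volumes below are hypothetical.

```python
from statistics import mean, stdev


def repeatability_limits(test, retest):
    """Return (lower, upper) limits of agreement for paired test-retest values.

    Assumption: Bland-Altman form, mean difference +/- 1.96 * sample SD
    of the paired differences. Not confirmed as the paper's exact method.
    """
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [r - t for t, r in zip(test, retest)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # stdev uses the n - 1 denominator
    return bias - spread, bias + spread


def is_real_change(baseline, follow_up, limits):
    """True if the measured change falls outside the limits of agreement."""
    lower, upper = limits
    change = follow_up - baseline
    return change < lower or change > upper


# Hypothetical test-retest tissue volumes (arbitrary units), for illustration only.
test_volumes = [100.0, 102.0, 98.0, 101.0]
retest_volumes = [101.0, 101.0, 99.0, 100.0]
limits = repeatability_limits(test_volumes, retest_volumes)

# A change larger than the test-retest spread is flagged as real.
print(limits, is_real_change(100.0, 105.0, limits))
```

Under this assumption, a tissue with poor repeatability (such as PM in the abstract) would have wide limits, so only large changes in an individual patient could be distinguished from measurement variation.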

## Full-text entities

- **Diseases:** VF (MESH:C537182)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12271309/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12271309/full.md

## References

14 references — full list in the complete paper: https://tomesphere.com/paper/PMC12271309/full.md

---
Source: https://tomesphere.com/paper/PMC12271309