# Comparative Evaluation of DeepLabCut Convolutional Neural Network Architectures for High-Precision Markerless Tracking in the Mouse Staircase Test

**Authors:** Valentin Fernandez, Landoline Bonnin, Afsaneh Gaillard, Christine Fernandez-Maloigne

PMC · DOI: 10.3390/bioengineering13020215 · Bioengineering · 2026-02-13

## TL;DR

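This study compares nine CNN backbones in DeepLabCut for tracking mouse forelimb movements in the Montoya Staircase test. Multi-scale DLCRNet architectures were the most accurate, and DLCRNet_stride16_ms5 offered the best balance between precision and computational efficiency.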

## Contribution

The study provides the first systematic comparison of nine CNN architectures in DeepLabCut for lateral-view markerless tracking of skilled reaching in the mouse Montoya Staircase test.

## Key findings

- Multi-scale DLCRNet architectures outperformed conventional backbones in tracking accuracy.
- DLCRNet_ms5 achieved the highest overall accuracy, while DLCRNet_stride16_ms5 balanced precision and efficiency best.
- Performance was evaluated on spatial accuracy (RMSE, PCK@5 px), mean average precision (mAP), robustness to occlusions, inference speed, and GPU memory usage.

## Abstract

Precise quantification of fine motor behaviour is essential for understanding neural circuit function and for evaluating therapeutic interventions in neurological disorders. While markerless pose estimation frameworks such as DeepLabCut (DLC) have transformed behavioural phenotyping, the choice of convolutional neural network (CNN) backbone has a critical impact on tracking performance, particularly in tasks involving small distal joints and frequent occlusions. In this study, we present the first systematic comparison of nine CNN architectures implemented in DLC for lateral-view analysis of skilled reaching movements in the Montoya Staircase test, a gold-standard assay for forelimb dexterity in rodent models of stroke and neurodegenerative disease. Using a dataset comprising both control and primary motor cortex (M1)–lesioned mice, we evaluated model performance across six key dimensions: spatial accuracy (RMSE, PCK@5 px), mean average precision (mAP), robustness to occlusions, inference speed, and GPU memory usage. Our results demonstrate that multi-scale DLCRNet architectures substantially outperform conventional backbones. DLCRNet_ms5 achieved the highest overall accuracy, while DLCRNet_stride16_ms5 provided the most favourable balance between precision and computational efficiency. These findings provide practical methodological guidance for neuroscience laboratories and highlight the importance of CNN architecture selection for the reliable quantification of fine motor behaviour in preclinical research.
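The two spatial accuracy metrics named in the abstract, RMSE and PCK@5 px, can be illustrated with a minimal sketch. This summary does not give the exact definitions used in the paper, so the formulas below are assumptions based on common usage: RMSE as the root of the mean squared Euclidean pixel error, and PCK as the fraction of keypoints whose error falls within the threshold. The function names and the toy coordinates are hypothetical and are not taken from the paper or from DeepLabCut.

```python
import math


def keypoint_errors(pred, truth):
    """Euclidean pixel distance for each predicted/ground-truth (x, y) pair."""
    return [math.hypot(px - tx, py - ty)
            for (px, py), (tx, ty) in zip(pred, truth)]


def rmse(pred, truth):
    """Root of the mean squared Euclidean error (assumed definition)."""
    errs = keypoint_errors(pred, truth)
    return math.sqrt(sum(e * e for e in errs) / len(errs))


def pck(pred, truth, threshold_px=5.0):
    """Fraction of keypoints within threshold_px of the ground truth."""
    errs = keypoint_errors(pred, truth)
    return sum(e <= threshold_px for e in errs) / len(errs)


# Toy coordinates for illustration only; not data from the study.
truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
pred = [(13.0, 14.0), (20.0, 20.0), (30.0, 36.0), (40.0, 48.0)]

print(keypoint_errors(pred, truth))  # per-keypoint errors in pixels
print(rmse(pred, truth))             # overall RMSE in pixels
print(pck(pred, truth))              # PCK at the default 5 px threshold
```

In this toy case the per-keypoint errors are 5, 0, 6 and 8 px, so two of the four keypoints fall within 5 px. A fixed pixel threshold such as 5 px is strict for small distal joints, which is one reason PCK can separate architectures that look similar on RMSE alone.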

## Linked entities

- **Diseases:** stroke (MONDO:0005098), neurodegenerative disease (MONDO:0005559)
- **Species:** Mus musculus (taxon 10090)

## Full-text entities

- **Diseases:** motor deficits (MESH:D009461), stroke (MESH:D020521), reduced digit extension (MESH:C567101), traumatic brain injury (MESH:D000070642), neural dysfunction (MESH:D015441), Alzheimer's (MESH:D000544), Parkinson's disease (MESH:D010300), neurodegenerative disease (MESH:D019636), injury to (MESH:D014947), motor (MESH:D000068079), grasp stability (MESH:D043171), occlusions (MESH:D001157)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** C57BL/6 — Mus musculus (Mouse), Transformed cell line (CVCL_C0MU)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12937725/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12937725/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/PMC12937725/full.md

---
Source: https://tomesphere.com/paper/PMC12937725