# Variability of the Penn upper motor neuron score in amyotrophic lateral sclerosis: need for a revised score

**Authors:** Anna B. Jacobsen, Gaia Fanella, Mamede de Carvalho, Martin Koltzenburg, Miguel Oliveira Santos, Bülent Cengiz, Jakob Blicher, Izabella Obál, Mia B. Heintzelmann, Wilfred Nix, Jean-Philippe Camdessanché, Anders Fuglsang-Frederiksen, Hatice Tankisi

PMC · DOI: 10.1007/s00415-025-12895-7 · Journal of Neurology · 2025-02-15

## TL;DR

This study assesses the inter-rater reliability of the Penn upper motor neuron score (PUMNS), a clinical scale for grading upper motor neuron signs in ALS patients, and calls for an updated score built on the most reliable signs.

## Contribution

The study evaluates the inter-rater reliability of the Penn upper motor neuron score among eight raters from different centers and identifies the signs with the highest reliability.

## Key findings

- The total PUMNS showed good inter-rater reliability with an ICC of 0.81.
- Hoffman's sign, Babinski's sign, clonus, and deep tendon reflexes had the highest inter-rater reliability.
- Facial reflex (Gwet's AC1 −0.038) and crossed adduction (Gwet's AC1 0.18) had the lowest inter-rater reliability.
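
As an illustration of the statistic behind the first finding, the sketch below computes an ICC from a patients-by-raters table of total scores. This summary does not say which ICC form the authors used; the two-way random-effects, absolute-agreement, single-rater form (ICC(2,1) in the Shrout and Fleiss notation) is assumed here. The function name `icc_2_1` and the example scores are hypothetical and are not study data.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is a list of rows, one per patient, each row holding one
    total score per rater (no missing values). The ICC form is an
    assumption; the source does not state which one was used.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)        # number of patients
    k = len(scores[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 patients and >= 2 raters")

    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Mean squares of the two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_err) / denom


# Made-up total scores for 4 patients rated by 3 raters (not study data).
example_scores = [
    [10, 11, 9],
    [22, 20, 21],
    [5, 7, 6],
    [15, 14, 16],
]
print(round(icc_2_1(example_scores), 2))
```

The sketch returns only the point estimate. A confidence interval such as the one reported in the paper would need an additional step that is not shown here.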

## Abstract

There is a need for a consensus on a clinical scale for evaluating upper motor neuron (UMN) burden in amyotrophic lateral sclerosis (ALS) to improve consistency in clinical diagnosis, research and monitoring of disease progression. The Penn upper motor neuron score (PUMNS) is the most commonly published scale; however, the reliability of the scale has only been evaluated in a single study involving two raters. The objective of this study was to evaluate the inter-rater reliability of the PUMNS in ALS patients among multiple raters, and to discuss an updated UMN score including the signs with the highest inter-rater reliability. This study included seven ALS patients (mean age: 71 ± 11.5 years; six males, one female). Each patient was evaluated with the PUMNS by eight raters from different centers blinded to previous observations. The intra-class correlation coefficient (ICC) was calculated to assess the inter-rater reliability of the total PUMNS. The inter-rater reliability of the binary subscores was assessed with Gwet’s AC1 coefficient. The inter-rater agreement for the total PUMNS yielded an ICC of 0.81 (95% CI 0.56, 0.96). Items with the highest inter-rater reliability included Hoffman's sign, Babinski's sign, clonus and deep tendon reflexes, while the facial reflex (Gwet’s AC1 −0.038; 95% CI −0.25, 0.18) and crossed adduction (Gwet’s AC1 0.18; 95% CI −0.32, 0.67) had the lowest inter-rater reliability. In conclusion, the PUMNS demonstrated good inter-rater reliability overall, while some of the subscores had poor inter-rater reliability. Based on this, we call for an updated UMN score to enhance diagnostic accuracy and research consistency in ALS.

The online version contains supplementary material available at 10.1007/s00415-025-12895-7.
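
The abstract reports Gwet's AC1 for the binary subscores. The following is a minimal sketch of that coefficient for a single binary item rated by several raters, assuming complete data (every rater scores every patient). The function name `gwet_ac1_binary` and the example ratings are hypothetical and are not taken from the study.

```python
def gwet_ac1_binary(ratings):
    """Gwet's AC1 for one binary item rated by several raters.

    `ratings` is a list of rows, one per patient; each row holds one
    0/1 rating per rater (1 = sign present). Assumes no missing ratings.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    n = len(ratings)       # number of patients
    r = len(ratings[0])    # number of raters
    if r < 2 or any(len(row) != r for row in ratings):
        raise ValueError("need a complete table with >= 2 raters")
    if any(x not in (0, 1) for row in ratings for x in row):
        raise ValueError("ratings must be 0 or 1")

    p_a = 0.0     # observed agreement
    pi_pos = 0.0  # average share of raters scoring the sign as present
    for row in ratings:
        pos = sum(row)
        neg = r - pos
        # Share of rater pairs that agree on this patient.
        p_a += (pos * (pos - 1) + neg * (neg - 1)) / (r * (r - 1))
        pi_pos += pos / r
    p_a /= n
    pi_pos /= n

    # Chance agreement for q = 2 categories:
    # (1 / (q - 1)) * sum_k pi_k * (1 - pi_k) = 2 * pi * (1 - pi).
    p_e = 2 * pi_pos * (1 - pi_pos)
    return (p_a - p_e) / (1 - p_e)


# Made-up ratings: 4 patients, 3 raters (not study data).
example_item = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
print(gwet_ac1_binary(example_item))
```

AC1 turns negative when observed agreement falls below the chance-agreement term, which is how a value such as −0.038 for the facial reflex can arise. The sketch gives the point estimate only, not the confidence intervals quoted in the abstract.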

## Linked entities

- **Diseases:** amyotrophic lateral sclerosis (MONDO:0004976)

## Full-text entities

- **Diseases:** ALS (MESH:D000690), UMN (MESH:D016472)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11829849/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11829849/full.md

---
Source: https://tomesphere.com/paper/PMC11829849