Improving case-detection of wasting among under-five-year-old children in Ethiopia: A secondary analysis of community-based surveys in humanitarian settings

Alinoor Mohamed Farah; Hamid Yimam Hassen; Sibhatu Biadgilign; Aweke Kebede; Yakob Desalegn; Beza Yilma; Tesfamichael Awoke; Samson Gebremedhin; Kemeria Barsenga; Tafara Ndumiyana; Robert Ackatia-Armah; Helina Tufa; Firaol Bekele; Hailu Wondim; Eskeziaw Abebe; Seifu Hagos Gebreyesus; Jason Morgan; Julia Robinson

PMC · DOI:10.1371/journal.pgph.0005952·March 27, 2026

Open Access

TL;DR

This study examines how using different MUAC cutoffs can improve or mislead the detection of child wasting in Ethiopia, showing that a one-size-fits-all approach may not be effective.

Contribution

The study identifies age- and region-specific MUAC cutoffs for detecting wasting in Ethiopia, challenging the use of a universal threshold.

Findings

01. The optimal MUAC cutoff for detecting WHZ <−2 was 139 mm nationally, with varying thresholds by age and region.

02. Using a higher MUAC cutoff improves detection but increases false positives, showing a sensitivity-specificity trade-off.

03. Regional variations in optimal MUAC thresholds suggest the need for context-specific approaches in humanitarian settings.

Abstract

WHO recommends weight-for-height Z-score (WHZ) <−2 or Mid-Upper Arm Circumference (MUAC) <125 mm as criteria for diagnosing wasting in children aged 6–59 months. In humanitarian settings, MUAC provides a simpler alternative to WHZ measurements, requiring only a tape measure. However, using MUAC alone may miss many at-risk children, causing discrepancies in wasting estimates. We analyzed 31 Standardized Monitoring and Assessment of Relief and Transitions (SMART) surveys conducted from 2022 to 2025 across eight Ethiopian regions. The sample included 23,419 children aged 6–59 months with complete data. MUAC’s diagnostic performance for Global Acute Malnutrition (GAM) was evaluated using WHZ as the reference, applying standard parameters including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), Youden index, and receiver operating characteristic (ROC) curve analyses. We assessed the accuracy of MUAC cut-offs nationally, by age group (6–23 and 24–59 months), and by administrative region.…

Figures

Fig 1 — Histograms of weight-for-height z-score (WHZ) by region, Ethiopia.

Fig 2 — Histograms of MUAC by region, Ethiopia.

Fig 3 — Scatter plots for Mid-Upper Arm Circumference (MUAC) vs. Weight-for-Height Z-score (WHZ) by region, Ethiopia.

Fig 4 — Receiver’s operating characteristic curve of MUAC for GAM against weight-for-height by region, Ethiopia.

Fig 5 — Receiver’s operating characteristic curve of MUAC for SAM against weight-for-height by region, Ethiopia.

Full text

Introduction

Childhood malnutrition remains a major global health challenge, affecting millions of children under five years of age in low- and middle-income countries [1–3]. Wasting, defined as low weight-for-height, indicates acute malnutrition among children under five [3], resulting from poor dietary intake, infections, and socioeconomic conditions, leading to developmental delays and increased mortality risk when severe [2]. According to the World Health Organization (WHO) and United Nations Children’s Fund (UNICEF) (2023), wasting affected 6.8% or 45 million children under 5 years of age globally in 2022, with 13.6 million (2.1%) suffering severe wasting. Over 75% of children with severe wasting live in Asia and 22% in Africa [2]. Children with wasting face a higher mortality risk, especially when they are stunted or underweight [4]. The global community’s Sustainable Development Goals (SDGs) aim to reduce wasting to <5% by 2025 and <3% by 2030 [5]. According to the Ethiopia Mini-Demographic and Health Survey (EMDHS) 2019, 7% of Ethiopian children are wasted. The highest prevalence of wasted children is in Somali (21%), Afar (14%), and Gambella (13%), while the lowest is in Addis Ababa (2%) and Harari (4%) [6].

Early identification and treatment of wasting in children are critical for improving health outcomes and reducing mortality among vulnerable populations [7–9]. Wasting can be categorized as severe, with a weight-for-height Z-score (WHZ) <−3 or mid-upper arm circumference (MUAC) <115 mm, or moderate, with WHZ between −3 and −2 or MUAC between 115 and 125 mm [10,11]. Global Acute Malnutrition (GAM) includes both moderate and severe forms, with thresholds of WHZ <−2 or MUAC <125 mm [12]. Although MUAC is a more effective predictor of mortality in both community [13,14] and hospitalized settings [15], and can be effectively implemented by minimally trained personnel [16], the current WHO-recommended MUAC cutoffs (115 mm for severe and 125 mm for moderate wasting) have limited sensitivity, identifying only 6–20% of severe cases and 13–23% of moderate cases across different settings [17].

Compounding these measurement challenges, the concordance between MUAC and WHZ in identifying wasted children varies substantially across studies. While some studies have found moderate agreement between MUAC and WHZ [18,19], others have reported poor concordance [20,21]. MUAC and WHZ often identify different sets of malnourished children. MUAC tends to be more sensitive in identifying younger children and girls. These differences likely stem from the fact that MUAC uses a single cut-off to define wasting, so it may overdiagnose the condition in some sub-populations while missing cases in older children, particularly boys [22]. WHZ-based diagnosis is less sensitive to this bias because it standardizes for sex and height, and thus indirectly for age [22]. Some studies have explored MUAC-for-age (MUACZ) as an alternative, finding improved agreement with WHZ compared with MUAC alone [23,24]. However, the benefits of MUACZ remain inconclusive. Given the discrepancies between MUAC and WHZ, some researchers have suggested using both criteria for comprehensive malnutrition screening [25], especially in areas with a high prevalence of chronic malnutrition and kwashiorkor [24].

In fragile humanitarian contexts, where precise WHZ measurements require specialized equipment and trained personnel, MUAC’s simplicity (requiring only a tape measure) makes it invaluable for rapid screening [17,26]. However, relying on MUAC alone risks missing many at-risk children and underestimating the prevalence of wasting, thereby increasing mortality from undiagnosed cases [17,26,27]. In Ethiopia, health posts commonly use MUAC as the sole criterion for identifying and managing wasting because of logistical challenges in obtaining accurate length/height data. However, given that MUAC and WHZ classify different groups of children as wasted, with misclassification rates that fluctuate by region and population, the relationship between GAM by MUAC and GAM by WHZ can diverge markedly. Therefore, this study aimed to evaluate the diagnostic performance of MUAC in identifying acute malnutrition among Ethiopian children aged 6–59 months nationally and document variation across age groups and among regions in Ethiopia.

Methods

Source of data

The study drew on 31 livelihood-zone and district-level surveys conducted between 2022 and 2025 across eight regions: Amhara, Afar, Benishangul-Gumuz, Gambella, Oromia, Somali, Southern, and Tigray. All datasets were obtained on August 1, 2025. All surveys were population-based and designed to be representative at the livelihood-zone or district level, following the Standardized Monitoring and Assessment of Relief and Transitions (SMART) methodology. They were implemented to assess the need for emergency nutrition programs and were led by the Government through the Emergency Nutrition Coordination Unit (ENCU) in collaboration with implementing partners.

A two-stage cluster sampling approach was applied in all surveys, with probability proportional to population size used in the first stage, consistent with the SMART methodology. Anthropometric measurements were collected using standardized instruments. Children’s height was measured using UNICEF height/length boards with a precision of 1 mm, while weight was measured using Seca scales and recorded to the nearest 0.1 kg.

Enumerators underwent training and a standardization test to ensure measurement quality. The SMART methodology includes standardized enumerator training modules that typically last five days. During this period, enumerators receive both theoretical and practical sessions and complete a standardization test to ensure they can accurately measure children. If enumerators fail the test, the survey cannot proceed until they retake and pass it. The training also includes a pilot phase, where enumerators practice conducting the survey before the actual data collection begins. For children without official documentation of birth dates, a local events calendar was used to estimate their age in months.

Data processing

Raw datasets from anthropometric surveys, including information on age, sex, weight, height/length, and MUAC, were obtained from the SMART+ aggregator, a repository where non-governmental organizations (NGOs), United Nations (UN) agencies, and governments publish SMART+ survey data, with prior approval from the Ethiopian Disaster Risk Management Commission (EDRMC) and the Ethiopia Emergency Nutrition Coordination Unit (ENCU). Data cleaning was conducted using the Emergency Nutrition Assessment (ENA) for SMART software (version 2020). Z-scores were calculated according to the WHO 2006 growth reference, and SMART flags (±3 z-scores) were applied to exclude implausible WHZ values from the analyses. The overall quality of the datasets was verified using the ENA plausibility check to ensure compliance with SMART quality standards. MUAC data were retained without exclusion criteria.

Following data cleaning, the datasets were imported into R software (version 4.5.1) for further analysis. Nutritional status was classified according to the standard definitions of the WHO. Wasting by WHZ was defined as <−2 z-scores and wasting by MUAC as <125 mm. Global Acute Malnutrition (GAM) was defined as WHZ <−2.0 or MUAC <125 mm. Severe Acute Malnutrition (SAM) was classified as WHZ <−3.0 or MUAC <115 mm. Moderate Acute Malnutrition (MAM) was defined as WHZ between −3.0 and −2.0 or MUAC between 115 mm and <125 mm. Cases of bilateral pitting edema were not included in the estimated prevalence of wasting by WHZ or MUAC; edema cases were rare, representing approximately 0.05% of the total cases. WHZ was missing for 1,108 children: 1,095 with missing height or weight and 13 with missing age. All WHZ-missing records were excluded because the required inputs for the WHZ calculation were unavailable.
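The case definitions above can be written down as a small classifier. The following is a minimal Python sketch for illustration only (the authors worked in R, and the function names are hypothetical); it treats a WHZ of exactly −3.0 as moderate and a MUAC of exactly 125 mm as not wasted, which is one reading of the stated boundaries, and it ignores edema, as the text does.

```python
def classify_whz(whz: float) -> str:
    """Classify acute malnutrition from a weight-for-height z-score."""
    if whz < -3.0:
        return "SAM"
    if whz < -2.0:
        return "MAM"
    return "Normal"


def classify_muac(muac_mm: float) -> str:
    """Classify acute malnutrition from MUAC in millimetres."""
    if muac_mm < 115:
        return "SAM"
    if muac_mm < 125:
        return "MAM"
    return "Normal"


def is_gam(whz: float, muac_mm: float) -> bool:
    """GAM under the combined definition: WHZ < -2 or MUAC < 125 mm."""
    return whz < -2.0 or muac_mm < 125


# A child can be wasted by one indicator and not the other.
example = (classify_whz(-2.4), classify_muac(131))
```

Here `example` pairs a WHZ-based and a MUAC-based label for the same hypothetical child, which is the kind of discordance the analysis quantifies.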

Statistical analysis

Data were analyzed to assess the comparability and diagnostic performance of MUAC compared with WHZ in identifying acute malnutrition across regions and age groups. Descriptive statistics (means, standard deviations, and proportions) were calculated for anthropometric indicators WHZ, MUAC, Height-for-Age Z-score (HAZ), and Weight-for-Age Z-score (WAZ), stratified by region. Group differences were examined by comparing the mean values and prevalence estimates.

The relationship between MUAC and WHZ was evaluated using linear regression models, with WHZ as the dependent variable and MUAC as the predictor. Scatter plots and region-specific regression coefficients were used to illustrate the strength and direction of the associations. Goodness-of-fit was assessed using the coefficient of determination (R²), and slopes were compared across regions to evaluate variability in predictive strength.
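As an illustration of the regression described above, the sketch below fits WHZ on MUAC by ordinary least squares and returns the slope, intercept, and R². It is a plain-Python sketch with made-up inputs and a hypothetical function name, not the authors' R code; in a region-stratified analysis it would be applied to each region's records separately.

```python
def fit_whz_on_muac(muac_mm, whz):
    """Ordinary least squares of WHZ (outcome) on MUAC in mm (predictor).

    Returns (slope, intercept, r_squared).
    """
    n = len(muac_mm)
    mean_x = sum(muac_mm) / n
    mean_y = sum(whz) / n
    sxx = sum((x - mean_x) ** 2 for x in muac_mm)
    syy = sum((y - mean_y) ** 2 for y in whz)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(muac_mm, whz))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up values, for illustration only.
slope, intercept, r2 = fit_whz_on_muac(
    [112, 121, 128, 134, 142, 150],
    [-2.9, -2.6, -1.1, -1.8, -0.7, -0.2],
)
```

The slope is in z-score units per millimetre of MUAC, so a steeper slope means the same change in MUAC corresponds to a larger change in WHZ.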

The prevalence of MAM and SAM was estimated using the WHO standard cut-offs for both MUAC and WHZ. Agreement between MUAC- and WHZ-based classifications was assessed using cross-tabulations, inflation factors (the ratio of MUAC-based to WHZ-based GAM prevalence, so that values below 1 indicate that MUAC identifies fewer cases than WHZ), and Cohen’s kappa statistics. Kappa values were interpreted as follows: ≤0, no agreement; 0.01–0.20, none to slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; and 0.81–1.00, almost perfect agreement [28,29]. Pearson’s correlation coefficients were calculated to evaluate the strength and direction of the linear relationship between MUAC and WHZ. A coefficient of ±1 indicates a perfect correlation, with the variables moving together in the same or opposite direction. Coefficients with absolute values of 0.50–0.99 were considered high, 0.30–0.49 moderate, and below 0.30 weak, while a value of zero indicates no linear relationship between the variables [29,30].
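To make the agreement statistic concrete, here is a minimal Python sketch of Cohen's kappa for a 2×2 cross-tabulation (wasted or not wasted by each indicator), together with the interpretation bands listed above. The two-category table is a simplifying assumption, since the paper reports agreement across MAM, Normal, and SAM; the counts and function names are hypothetical.

```python
def cohens_kappa(both, muac_only, whz_only, neither):
    """Cohen's kappa from a 2x2 table of MUAC-based vs WHZ-based wasting."""
    n = both + muac_only + whz_only + neither
    observed = (both + neither) / n
    # Chance agreement from the marginal totals of each indicator.
    expected = (
        (both + muac_only) * (both + whz_only)
        + (whz_only + neither) * (muac_only + neither)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map a kappa value to the interpretation bands given in the text."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "none to slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical counts, for illustration only.
kappa = cohens_kappa(both=20, muac_only=5, whz_only=10, neither=65)
label = agreement_label(kappa)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because most children are classified as not wasted by both indicators.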

The diagnostic performance of MUAC in identifying WHZ-defined wasting was examined using receiver operating characteristic (ROC) curve analysis, with area under the curve (AUC) values computed for each region to identify variations in diagnostic accuracy across subpopulations. The AUC ranges from 0.5 (no better than chance) to 1 (perfect accuracy). Optimal cut-offs were determined using Youden’s index (sensitivity + specificity − 1). This index ranges from 0 to 1, with higher values indicating better diagnostic performance. The MUAC cutoff with the highest Youden’s index was selected as the optimal threshold because it maximized the combined sensitivity and specificity.

Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and likelihood ratios (LR⁺, LR⁻) were calculated for each cutoff value. Sensitivity was defined as the proportion of wasted children (based on the gold standard WHZ) correctly identified by the MUAC cutoff, and specificity as the proportion of non-wasted children correctly classified as not wasted. The PPV represents the likelihood that children classified as wasted by MUAC truly have wasting according to WHZ, whereas the NPV represents the likelihood that those classified as not wasted by MUAC truly do not have wasting by WHZ. Accuracy reflects the overall proportion of children correctly classified as wasted or not wasted when MUAC was compared with WHZ. The positive likelihood ratio (LR⁺) indicates how much more likely a positive MUAC result is in children with wasting by WHZ than in those without, and the negative likelihood ratio (LR⁻) indicates how much less likely a negative MUAC result is in children with wasting than in those without.

Analyses were stratified by region and age group (<24 and ≥24 months) to capture the heterogeneity in MUAC performance. False-positive rates (FPR) and false-negative rates (FNR) were estimated across a range of MUAC thresholds to quantify misclassification risks when using MUAC compared to WHZ. Regional variation in FPR/FNR profiles was highlighted to identify settings in which MUAC underestimates or overestimates the burden of wasting. All analyses were conducted using R (version 4.4.1).
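The metrics defined in this section, and the selection of the cutoff that maximizes Youden's index, can be sketched as follows. This is an illustrative Python sketch on made-up data with hypothetical function names, not the authors' R code; it treats MUAC below the cutoff as a positive screen and WHZ-defined wasting as the reference.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(muac_mm, wasted_by_whz, cutoff_mm):
    """Diagnostic metrics for the rule 'MUAC < cutoff' against WHZ wasting."""
    tp = fp = fn = tn = 0
    for muac, wasted in zip(muac_mm, wasted_by_whz):
        screen_positive = muac < cutoff_mm
        if screen_positive and wasted:
            tp += 1
        elif screen_positive:
            fp += 1
        elif wasted:
            fn += 1
        else:
            tn += 1
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "lr_pos": _ratio(sensitivity, 1 - specificity),
        "lr_neg": _ratio(1 - sensitivity, specificity),
        "youden": sensitivity + specificity - 1,
    }


def best_cutoff(muac_mm, wasted_by_whz, candidate_cutoffs):
    """Return the candidate cutoff with the highest Youden index."""
    return max(
        candidate_cutoffs,
        key=lambda c: diagnostic_metrics(muac_mm, wasted_by_whz, c)["youden"],
    )


# Made-up data: eight children, four of them wasted by WHZ.
muac = [110, 118, 122, 127, 131, 136, 141, 148]
wasted = [True, True, False, True, True, False, False, False]
optimal = best_cutoff(muac, wasted, [125, 130, 135, 140])
metrics_at_optimal = diagnostic_metrics(muac, wasted, optimal)
```

Because PPV and NPV depend on prevalence while sensitivity and specificity do not, the same cutoff can show a low PPV and a high NPV in a population where wasting is relatively uncommon.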

Ethics statement

This was a secondary analysis of anonymous data in which no individual cluster or village location could be identified; therefore, formal ethical clearance was not required. Permission to use and analyze the dataset was obtained from the EDRMC and ENCU, which provided raw datasets.

Results

Characteristics of surveys and study population

As shown in Table 1, 31 SMART+ surveys were included, covering 23,419 children across eight Ethiopian regions. The highest number of surveys was conducted in the Somali Region (10 surveys, 7,341 children), followed by Oromia (6 surveys, 4,571 children), Amhara (5 surveys, 3,578 children), and Tigray (5 surveys, 3,033 children). Fewer surveys were conducted in Afar (2 surveys, 1,483 children), Benishangul-Gumuz (1 survey, 537 children), Gambella (1 survey, 1,140 children), and Southern Ethiopia (1 survey, 1,736 children). The median age of the participants was 32 months (IQR: 20–46). The sample was nearly evenly split by sex, with 11,867 (51%) boys and 11,552 (49%) girls.

Table 1: Distribution of surveys and study participants by region.

Descriptive summary of anthropometric indicators by region

As presented in Table 2, the overall mean WHZ across all eight regions was –0.81, with regional variation. The lowest WHZ was recorded in Somali (–1.02), followed by Afar (–0.97) and Tigray (–0.87). The highest WHZ was found in Oromia (–0.48).

Table 2: Descriptive summary of WHZ and MUAC, by region, Ethiopia.

The overall mean MUAC was 142.09 mm. The lowest regional mean MUAC was reported in Tigray (134.31 mm), whereas the highest was reported in Gambella (146.55 mm). Somali, despite recording the lowest WHZ, had a relatively high mean MUAC (145.83 mm).

Relationship between MUAC and WHZ by region

The histograms of WHZ by region (Fig 1) show approximately normal distributions, with Somali and Tigray shifted toward lower WHZ values than the other regions. MUAC distributions (Fig 2) are also roughly normal, with most regions concentrated around 140–150 mm. Somali and Gambella show broader MUAC spreads, whereas Tigray exhibits a left-skewed distribution. Both measures revealed regional variations in the acute nutritional status of children across Ethiopia.

Fig 1: Histograms of weight-for-height z-score (WHZ) by region, Ethiopia.

Fig 2: Histograms of MUAC by region, Ethiopia.

The scatter plots of MUAC versus WHZ by region (Fig 3) demonstrate a positive linear relationship across all regions, suggesting that higher MUAC values are generally associated with higher WHZ. The strength of this relationship varied by region. Benishangul-Gumuz exhibited the strongest association (R² = 0.43), followed by Tigray, Afar, Amhara, and South Ethiopia, which displayed moderate associations (R² = 0.32–0.36). Oromia presented a weaker association (R² = 0.29), whereas Somali and Gambella showed the weakest (R² = 0.23 and 0.21, respectively). The regression slopes also differed across regions. Tigray and Benishangul-Gumuz exhibited steeper slopes (0.0566), indicating that a given change in MUAC corresponded to a larger change in WHZ than in other regions.

Fig 3: Scatter plots for Mid-Upper Arm Circumference (MUAC) vs. Weight-for-Height Z-score (WHZ) by region, Ethiopia.

Nutritional status of children by WHZ and MUAC indicators

Table 3 presents the regional prevalence of wasting among children in Ethiopia by MUAC and WHZ, along with the inflation factor (MUAC/WHZ). The prevalence of GAM varied across regions. In Tigray, GAM by MUAC (23.8%) was higher than by WHZ (18.2%), whereas in Afar (MUAC 9.0%, WHZ 15.8%), Oromia (MUAC 6.0%, WHZ 6.7%), and South Ethiopia (MUAC 6.9%, WHZ 14.6%), MUAC underestimated GAM relative to WHZ. In Amhara (MUAC 13.2%, WHZ 10.6%) and Benishangul-Gumuz (MUAC 7.8%, WHZ 6.9%), MUAC slightly overestimated GAM. The largest differences in prevalence were observed in Somali (MUAC 2.8%, WHZ 15.8%) and Gambella (MUAC 2.4%, WHZ 9.6%), where MUAC identified only a small proportion of children classified as malnourished by WHZ.

Table 3: Distribution of wasting by MUAC and WHZ by region, Ethiopia.
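The inflation factors can be checked directly from the regional GAM prevalences quoted in the text. The short Python sketch below is illustrative only; it takes the ratio as MUAC-based over WHZ-based prevalence, which is an assumption about the direction, chosen because it reproduces the factors quoted in the Discussion (1.31 for Tigray and 0.18 for Somali).

```python
# GAM prevalence (%) by region as quoted in the text: (by MUAC, by WHZ).
gam_prevalence = {
    "Tigray": (23.8, 18.2),
    "Afar": (9.0, 15.8),
    "Oromia": (6.0, 6.7),
    "South Ethiopia": (6.9, 14.6),
    "Amhara": (13.2, 10.6),
    "Benishangul-Gumuz": (7.8, 6.9),
    "Somali": (2.8, 15.8),
    "Gambella": (2.4, 9.6),
}

# Assumed direction: MUAC-based prevalence divided by WHZ-based prevalence.
inflation_factor = {
    region: muac / whz for region, (muac, whz) in gam_prevalence.items()
}

# Regions where MUAC yields fewer GAM cases than WHZ (factor below 1).
muac_lower = sorted(r for r, f in inflation_factor.items() if f < 1)
```

Under this reading, a factor below 1 means a MUAC-only program would plan for a smaller caseload than a WHZ-based estimate would suggest.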

Table 4 compares regional malnutrition classifications using WHZ and MUAC and shows the agreement between indicators using kappa statistics. Across regions, discrepancies existed between WHZ- and MUAC-based classification of children into MAM, Normal, and SAM categories. In Tigray, the distribution was similar between WHZ and MUAC, with 14.5% classified as MAM by WHZ versus 19.6% by MUAC, showing moderate agreement (kappa = 0.453). Amhara (kappa = 0.388), Benishangul-Gumuz (0.365), Southern Ethiopia (0.325), Oromia (0.260), and Afar (0.211) showed fair agreement, while Somali (0.128) and Gambella (0.117) showed only slight agreement, indicating substantial differences in classification between the indicators. In Somali, Gambella, and Southern Ethiopia, WHZ classified more children as MAM than MUAC did. The kappa statistics showed that the concordance between WHZ and MUAC varied by region, with the highest agreement in Tigray.

Table 4: Concordance between WHZ and MUAC by region, with kappa statistics, Ethiopia.

MUAC diagnostic performance

Fig 4 shows the ROC curves assessing the diagnostic performance of MUAC in detecting wasting defined by WHZ in different regions of Ethiopia. The area under the curve (AUC) values indicate overall accuracy. MUAC performed best in Benishangul-Gumuz (AUC = 0.892) and Tigray (0.865). South Ethiopia (0.849), Amhara (0.847), and Oromia (0.829) also exhibited good discriminative ability. Afar (0.819), Gambella (0.807), and Somali (0.778) had comparatively lower AUCs, indicating moderate predictive performance. Overall, MUAC showed varying effectiveness in identifying wasted children across regions, performing better in most regions of northern and central Ethiopia than in some eastern and southern regions. Similarly, MUAC was generally a reliable tool for identifying severe wasting across regions, with some variation in accuracy (Fig 5). Its diagnostic performance was high in Benishangul-Gumuz (0.991) and Oromia (0.953), and slightly lower but still strong in Amhara (0.858) and Gambella (0.872).

Fig 4: Receiver operating characteristic (ROC) curve of MUAC for GAM against weight-for-height by region, Ethiopia.

Fig 5: Receiver operating characteristic (ROC) curve of MUAC for SAM against weight-for-height by region, Ethiopia.
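For readers who want to see what the AUC measures in this setting, it can be computed as the probability that a randomly chosen WHZ-wasted child has a lower MUAC than a randomly chosen non-wasted child, counting ties as one half. The Python sketch below uses this rank-based formulation on made-up data with a hypothetical function name; it is an illustration, not the authors' R procedure.

```python
def auc_low_muac_positive(muac_mm, wasted_by_whz):
    """AUC for MUAC as a classifier of wasting, where lower MUAC is 'positive'.

    Computed over all wasted / non-wasted pairs: 1 if the wasted child has
    the lower MUAC, 0.5 for a tie, 0 otherwise, averaged over pairs.
    """
    wasted = [m for m, w in zip(muac_mm, wasted_by_whz) if w]
    not_wasted = [m for m, w in zip(muac_mm, wasted_by_whz) if not w]
    score = 0.0
    for m_wasted in wasted:
        for m_other in not_wasted:
            if m_wasted < m_other:
                score += 1.0
            elif m_wasted == m_other:
                score += 0.5
    return score / (len(wasted) * len(not_wasted))


# Made-up data: eight children, four of them wasted by WHZ.
muac = [110, 118, 122, 127, 131, 136, 141, 148]
wasted = [True, True, False, True, True, False, False, False]
auc = auc_low_muac_positive(muac, wasted)
```

The pairwise loop is quadratic in sample size, which is fine for a sketch but would normally be replaced by a rank-based or library implementation for tens of thousands of records.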

MUAC optimal threshold

The optimal MUAC threshold for detecting WHZ-defined wasting varied across Ethiopian regions (Table 5), ranging from 126.5 mm in Tigray to 143.5 mm in Gambella. Sensitivity was lowest in Somali (0.70) and highest in Gambella (0.83), while specificity ranged from 0.66 in Gambella to 0.86 in Benishangul-Gumuz. This reflects MUAC’s varying effectiveness in identifying wasted children as classified by WHZ across regions. The positive predictive values were generally low, with the highest in Tigray (0.49), whereas the negative predictive values were consistently high across regions, exceeding 0.93 and reaching 0.98 in Oromia and Benishangul-Gumuz. The accuracy ranged from 0.67 in Gambella to 0.85 in Benishangul-Gumuz. The Youden index was highest in Benishangul-Gumuz (0.642) and lowest in Somali (0.41), indicating the strongest performance in Benishangul-Gumuz and Tigray and weaker diagnostic efficiency in Somali.

Table 5: MUAC optimal threshold of wasting by region in Ethiopia, using WHZ as the Gold Standard.

Table 6 indicates that MUAC shows strong diagnostic performance for detecting severe wasting across regions, using WHZ as the gold standard. AUC values ranged from 0.858 in Amhara to 0.991 in Benishangul-Gumuz. Sensitivity and specificity were consistently high, and NPVs were nearly perfect (≈1.00), indicating that a MUAC above the cutoff reliably ruled out severe wasting. The Youden Index was highest in Benishangul-Gumuz (1.962 at MUAC 122.5 mm), showing excellent diagnostic performance at a low cutoff. In contrast, Amhara had the lowest index (1.588 at MUAC 130.5 mm), suggesting weaker utility despite a higher threshold.

Table 6: Diagnostic performance of MUAC for severe wasting by region in Ethiopia, using WHZ as the Gold Standard.

Optimal cut-offs

The diagnostic performance of MUAC in detecting wasting varied by age (Table 7), with WHZ serving as the gold standard. For children <24 months, low cutoffs (120–124 mm) had high specificity (0.92–0.97) but low sensitivity (0.31–0.51), with moderate PPVs (0.53–0.64) and high NPVs (0.89–0.91). Higher cutoffs (132–140 mm) increased sensitivity (0.83–0.96) but reduced specificity (0.40–0.45). NPVs remained high (>0.95), while PPVs fell below 0.25. The Youden Index peaked at 127–133 mm (0.528–0.545), identifying 128 mm as the optimal cutoff.

Table 7: Diagnostic performance metrics across various MUAC cutoffs by age category among children with wasting in Ethiopia, using WHZ as the Gold Standard.

For children ≥24 months, the sensitivity was lower across the cutoffs. At 120–124 mm, sensitivity was 0.06–0.15 and specificity was 0.99, while higher cutoffs (136–145 mm) improved sensitivity (0.52–0.83) but reduced specificity to 0.57. NPVs remained high (≥0.89), and PPVs were modest (0.21–0.65). The Youden Index peaked at 135–140 mm, with an optimal cutoff of 140 mm.

MUAC’s diagnostic performance for severe wasting in Ethiopian children also varied by age (Table 8). For children under 24 months, sensitivity increased from 0.21 at 111 mm to 0.98 at 137 mm, whereas specificity decreased from 0.99 to 0.46. PPV remained low (0.05–0.47), whereas NPV remained high (0.98–1.00), indicating that children classified as not severely wasted by MUAC were rarely severely wasted by WHZ. Youden’s index peaked at 125 mm (0.664), showing the optimal sensitivity-specificity balance.

Table 8: Diagnostic performance metrics across various MUAC cutoffs by age category in children with severe wasting in Ethiopia, using WHZ as the Gold Standard.

For children ≥24 months, the sensitivity increased from 0.04 at 111 mm to 0.78 at 137 mm, whereas the specificity decreased from 1.00 to 0.78. PPV remained low (0.05–0.39) and NPV remained high (0.99–1.00). Youden’s index peaked at 135 mm (0.560). Younger children (<24 months) were better identified at lower MUAC thresholds (125 mm), whereas older children required higher cutoffs (135 mm). Cutoffs of 125–135 mm provided an optimal sensitivity-specificity balance across age groups.

MUAC misclassifications

Table 9 shows the false-positive rate (FPR) and false-negative rate (FNR) for MUAC at three cutoff points (<125 mm, <130 mm, and <140 mm) across Ethiopian regions, using WHZ as the gold standard. These findings show the trade-offs between misclassification errors when applying different MUAC thresholds.

At the <125 mm cutoff, FPRs were low, ranging from 1.2% in Somali to 13.4% in Tigray, indicating that few well-nourished children were misclassified as moderately wasted according to WHZ (MAM-WHZ). However, FNRs were high, exceeding 40% in most regions and reaching 88.5% in Somali and 88.1% in Gambella. This cutoff would miss about two-thirds of MAM-WHZ cases nationally (66%), undermining its screening utility.

At <130 mm, FPRs increased modestly, ranging from 3.7% in Gambella to 27.9% in Tigray, as more children were incorrectly identified as MAM-WHZ. FNRs decreased substantially, falling to 13.8% in Tigray and 27.0% in Benishangul-Gumuz. Nationally, the FNR dropped to 51.5%, meaning that about half of the MAM-WHZ cases were still missed, although fewer than at <125 mm. This cutoff reduced missed cases but falsely labeled one in ten well-nourished children as MAM-WHZ.

At <140 mm, FPRs rose sharply, reaching 61.6% in Tigray and 48.1% in Amhara, meaning that nearly half of well-nourished children in Amhara and more than half in Tigray would be misclassified as MAM-WHZ. FNRs dropped markedly, with most regions below 20%, except for Somali (38.5%) and Gambella (35.8%). The national FNR was 21.6%, indicating that fewer than one in four MAM-WHZ cases were missed. However, the high national FPR of 36.2% means that over one-third of healthy children would be falsely identified as MAM-WHZ cases, potentially overburdening supplementation programs.

Table 9: False-Positive Rate (FPR) and False-Negative Rate (FNR) across regions at various MUAC cutoffs in estimating MAM-WHZ, Ethiopia.
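The trade-off described for Table 9 follows from how the two error rates are defined: FPR is the share of non-wasted children falling below the MUAC cutoff, and FNR is the share of WHZ-wasted children at or above it. The Python sketch below sweeps the same three cutoffs over made-up data to show the pattern; the data and function name are hypothetical and the numbers are not the paper's.

```python
def error_rates_by_cutoff(muac_mm, wasted_by_whz, cutoffs_mm):
    """Return (cutoff, FPR, FNR) for each MUAC cutoff, with WHZ as reference."""
    wasted = [m for m, w in zip(muac_mm, wasted_by_whz) if w]
    not_wasted = [m for m, w in zip(muac_mm, wasted_by_whz) if not w]
    rows = []
    for cutoff in cutoffs_mm:
        # False positives: non-wasted children screened positive by MUAC.
        fpr = sum(1 for m in not_wasted if m < cutoff) / len(not_wasted)
        # False negatives: wasted children screened negative by MUAC.
        fnr = sum(1 for m in wasted if m >= cutoff) / len(wasted)
        rows.append((cutoff, fpr, fnr))
    return rows


# Made-up data: eight children, four of them wasted by WHZ.
muac = [110, 118, 122, 127, 131, 136, 141, 148]
wasted = [True, True, False, True, True, False, False, False]
rows = error_rates_by_cutoff(muac, wasted, [125, 130, 140])
```

Raising the cutoff can only move children from screen-negative to screen-positive, so FNR never increases and FPR never decreases as the threshold goes up.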

Table 10 presents the FPR and FNR of MUAC across Ethiopian regions at cutoffs from 115 mm to 118 mm for estimating severe wasting based on WHZ (SAM-WHZ). The results show that increasing the MUAC threshold reduces missed cases (FNR) but increases false positives (FPR).

At 115 mm, FPRs were low across regions, from 0.2% in Somali to 2.5% in Tigray, indicating that few well-nourished children were misclassified as having SAM-WHZ. However, FNRs were high, with most regions missing over half of the true SAM-WHZ cases. Somali and Gambella missed over 90% of SAM-WHZ cases (FNR = 92.1% and 92.3%), while Benishangul-Gumuz missed 42.9%. Nationally, the FNR was 70.8%, indicating that over two-thirds of the SAM-WHZ cases were undetected.

At 116 mm, the FPRs increased slightly (to 0.3–2.8%), and the FNRs declined modestly. Tigray’s FNR fell from 51.4% to 49.5%, and Oromia’s from 62.2% to 60.0%. FNRs remained high in Somali (89.0%) and Gambella (92.3%). The national FNR was 67.9%.

At 117 mm, FPRs increased slightly (up to 3.3% in Tigray), whereas FNRs decreased, falling to 46.8% in Tigray, 55.6% in Oromia, and 37.5% in Southern Ethiopia. Nationally, the FNR dropped to 64.9%, but Somali (89.0%) and Gambella (84.6%) continued to miss most SAM-WHZ cases.

At 118 mm, the FPRs increased marginally, with all regions below 4%, and the FNRs showed greater improvement. Southern Ethiopia maintained the lowest FNR (37.5%), and Tigray’s fell to 45.0%. The national FNR decreased to 63.3%. Somali (88.2%) and Gambella (84.6%) remained poor performers, missing nearly 9 in 10 SAM-WHZ cases.

Table 10: False-Positive Rate (FPR) and False-Negative Rate (FNR) across regions at various MUAC cutoffs in estimating SAM-WHZ, Ethiopia.

Discussion

Our analysis sought to investigate the diagnostic accuracy of MUAC for detecting wasting among Ethiopian children aged 6–59 months and to explore potential explanations for the discrepancies and their implications for nutrition program planning and design in Ethiopia. We used a large sample from SMART surveys conducted in eight regions of Ethiopia. Our findings indicate that the optimal MUAC threshold for children 6–59 months corresponding to WHZ <−2 was 139 mm, although it varied by age group and region. Younger children (<24 months) had a lower optimal cutoff (128 mm), whereas older children required a higher threshold (140 mm). Regionally, the thresholds ranged from 126.5 mm in Tigray to 143.5 mm in Gambella, indicating substantial geographic variation. We also observed discrepancies in GAM prevalence by case definition: nationally, WHZ produced a higher estimate (12.9%) than MUAC (8.5%), corresponding to an inflation factor of 0.66. However, this gap was not consistent; MUAC yielded a higher estimate than WHZ in Tigray (inflation factor 1.31), whereas WHZ identified many more cases in Somali (inflation factor 0.18).

In surveys conducted in low- and middle-income countries (LMIC) and emergency settings, the correlation between MUAC and WHZ was moderate. For instance, Bilukha and Leidman analyzed 733 humanitarian surveys and identified a Spearman’s ρ of approximately 0.55 (unadjusted R² = 0.36) when comparing the population prevalence of GAM as determined by WHZ and MUAC [20]. Similarly, Leidman et al. aggregated data from 882 representative surveys and reported a Pearson correlation coefficient (r) of approximately 0.49 (R² = 0.24) between individual MUAC and WHZ measurements in children aged 6–59 months [23]. These findings are consistent with the regional R² values observed in Ethiopia, which ranged from approximately 0.2 to 0.4. In essence, while a higher MUAC generally corresponds to a higher WHZ, there are numerous instances in which children are classified differently by each metric.

The patterns observed across regions were consistent with those of other studies. Bilukha et al. found that the MUAC and WHZ correlation was the strongest in the Middle East/North Africa and the weakest in Eastern/Southern Africa [20]. This aligns with the lowest correlations found in Somali/Gambella (Eastern Africa). Other African surveys have demonstrated similar moderate associations: a survey analysis in Mozambique reported a Spearman rank correlation coefficient of approximately 0.59 between individual MUAC and WHZ scores [23], comparable to the mid-range R² of the regions. Similarly, sub-Saharan-wide analyses revealed only moderate correlation; even adjusted models of country-level data showed R² = 0.43–0.50 [20]. In summary, global LMIC data confirmed a positive but not strong MUAC–WHZ link, which is consistent with our Ethiopian findings.

In our survey analysis, GAM prevalence was higher by WHZ (12.9%) than by MUAC (8.5%), with an inflation factor of 0.66. This aligns with the general observation that WHZ often results in higher wasting rates than MUAC [20]. For instance, a recent survey in Ethiopia’s Amhara Region reported a GAM of 13.2% by WHZ versus 8.6% by MUAC [31], reflecting the national trend. However, regional patterns show considerable variation. Pastoral and agro-pastoral regions, such as Somali, Gambella, Afar, and South Ethiopia, exhibited extreme inflation factors, with WHZ-based GAM rates far exceeding MUAC-based rates. In contrast, highland agrarian regions, such as Tigray, Amhara, Oromia, and Benishangul-Gumuz, displayed much smaller differences or even slightly higher MUAC-based rates. These variations underscore that the choice between MUAC and WHZ can significantly alter caseload estimates in certain areas, particularly pastoralist zones. The observed positive association between MUAC and WHZ across all regions aligns with the established understanding that both indicators reflect aspects of acute malnutrition. However, the variation in the strength of this association across regions underscores the complexity of nutritional assessment and the influence of contextual factors.
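
As a worked check, and assuming the inflation factor (IF) is defined as the ratio of MUAC-based to WHZ-based GAM prevalence (a definition inferred from the reported figures rather than stated in this section), the national value follows directly from the two prevalences:

```latex
\[
\mathrm{IF} \;=\; \frac{\mathrm{GAM}_{\mathrm{MUAC}}}{\mathrm{GAM}_{\mathrm{WHZ}}}
\;=\; \frac{8.5\%}{12.9\%} \;\approx\; 0.66
\]
```

Under this reading, a value below 1 means that MUAC identifies fewer cases than WHZ, and a value above 1 means the reverse.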

Analysis of 733 crisis surveys from humanitarian settings in 41 countries revealed a median GAM rate of 10.47% when assessed using WHZ, compared to 6.66% when measured using MUAC. In 74% of these surveys, the prevalence determined by WHZ exceeded that determined by MUAC [20]. This supports our broader conclusion that WHZ estimates are higher than MUAC estimates. Numerous studies have highlighted significant discrepancies. For instance, in Cambodia, a GAM of 10.6% was reported by WHZ, whereas MUAC reported only 3.3% [32]. Similarly, a recent study in Somalia found a dramatic difference in wasting prevalence between the two measurements: 1.5% by MUAC versus 14.8% by WHZ [33]. Emergency surveillance data from Somalia also revealed marked inconsistencies between these two indicators [34]. Conversely, a study in southern Ethiopia found a 5.4% GAM by WHZ versus 10.5% by MUAC [35], while another study in Bangladesh found that GAM prevalence was 17.1% by WHZ and 22.5% by MUAC [36].

Several factors explain the variation in the MUAC vs WHZ relationship. Demographically, MUAC captures younger, smaller, and more stunted children (especially girls), whereas WHZ flags older and taller children [37]. These patterns can strengthen the MUAC–WHZ relationship: in regions with a higher proportion of young or stunted children (or more girls), WHZ tends to rise more rapidly with MUAC. This aligns with our findings (Table 2), in which northern regions with higher stunting levels showed stronger correlations. WHZ is highly sensitive to body proportions (leg length vs. trunk length). Children in pastoralist or lowland groups often have longer legs (lower sitting-to-standing height ratios) than the global reference; this lowers their WHZ for a given weight and inflates the GAM by WHZ. In agrarian highland populations (higher sitting ratios), WHZ and MUAC tended to agree more [38]. Similarly, fat distribution matters: regions with the “thin-fat” phenotype (high central fat) may see WHZ affected differently than MUAC measurements [38,39]. In general, differences in body shape, stunting, and limb proportions drive much of the WHZ and MUAC gaps.

Relying on MUAC alone risks missing many at-risk children and underestimating the prevalence of wasting, thereby increasing mortality from undiagnosed cases [17,26,27]. Analysis of data from 48 countries estimated that a minimum of 300,000 annual deaths could occur among children excluded by MUAC-only screening [40]. Another multi-country pooled analysis demonstrated that MUAC and WHZ identify different child populations, with similar mortality risks for children missed by either measurement [41].

Our analysis also showed that higher MUAC thresholds detected more cases but also yielded more false positives. Raising the MUAC cutoff to 130–136 mm doubled sensitivity (to 49–67%) while specificity remained above 75%. Very high cutoffs (139–145 mm) gave sensitivities of 76–89%, but specificity dropped to 47–67%, meaning that most screen-positive children were not truly wasted, even though about half to two-thirds of non-wasted children were still correctly classified. Our findings are consistent with those of other studies showing that the WHO MUAC threshold of 125 mm has high specificity but low sensitivity [42,43]. Numerous studies suggest that raising the cutoff would identify many more children [36,44,45]. These studies consistently show that higher MUAC cutoffs would align better with WHZ cutoffs.

To further unpack these trade-offs, we examined age-specific patterns and regional differences in misclassification errors. MUAC performance differed by age. Among children <24 months, low MUAC cutoffs (120–124 mm) showed high specificity (0.92–0.97) but sensitivity below 50%. Raising the cutoff to 130–132 mm increased sensitivity to 83%, with a specificity of 70–75%. The Youden index peaked at 127–133 mm (J = 0.53–0.55) for children under 2 years, suggesting that 125 mm might be too strict for the youngest children. Studies recommend a slightly higher MUAC cutoff for young children to detect more cases [10,36]. For children ≥24 months, sensitivity at 120–124 mm was low (6–15%) despite a specificity of 0.99, so most older wasted children were missed. Higher cutoffs improved sensitivity (52% at 136 mm, 83% at 145 mm) but reduced specificity to 0.57–0.64. The Youden index peaked at 135–140 mm (J = 0.35–0.43). The single standard MUAC cutoff is therefore inadequate for older children. Studies have confirmed lower MUAC sensitivity in older age groups [10,46]. Our findings support raising the MUAC cutoff for community screening, particularly for older children, to improve sensitivity. This must be balanced against the lower PPV and resource needs.
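
The cutoff selection described above can be sketched as a scan over candidate MUAC cutoffs, computing sensitivity, specificity, and the Youden index (J = sensitivity + specificity − 1) at each. This is a minimal Python sketch rather than the analysis code used in this study: `youden_scan`, `best_cutoff`, and the toy data are hypothetical, and WHZ < −2 is used as the reference.

```python
def youden_scan(muac_mm, whz, cutoffs, whz_cutoff=-2.0):
    """For each candidate cutoff, return (cutoff, sensitivity, specificity, J)
    for the rule MUAC < cutoff against the reference WHZ < whz_cutoff."""
    rows = []
    for c in cutoffs:
        tp = fp = tn = fn = 0
        for m, z in zip(muac_mm, whz):
            wasted = z < whz_cutoff
            flagged = m < c
            if wasted and flagged:
                tp += 1
            elif wasted:
                fn += 1
            elif flagged:
                fp += 1
            else:
                tn += 1
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        rows.append((c, sens, spec, sens + spec - 1))
    return rows


def best_cutoff(rows):
    """Return the row with the highest Youden index (first one on ties)."""
    return max(rows, key=lambda row: row[3])


# Hypothetical toy data (not survey data)
toy_muac = [118, 124, 128, 132, 138, 126, 134, 142, 150, 146]
toy_whz = [-2.8, -2.5, -2.3, -2.1, -2.2, -1.0, -1.5, -0.5, 0.2, -1.8]

rows = youden_scan(toy_muac, toy_whz, cutoffs=[125, 130, 135, 140])
for cutoff, sens, spec, j in rows:
    print(f"{cutoff} mm: sensitivity={sens:.2f} specificity={spec:.2f} J={j:.2f}")
print("Highest Youden index at", best_cutoff(rows)[0], "mm")
```

Running the same scan separately for children <24 months and ≥24 months would give age-specific cutoffs. The Youden index weights sensitivity and specificity equally, which is itself a choice that programs may not share.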

Similar trade-offs were observed at the population level in different regions. Using MUAC <125 mm as a screening threshold keeps false positives very low (1–13% across regions); however, it misses the majority of children with acute malnutrition: about two out of every three cases nationally (66%), and nearly 9 in 10 in Somali (88%). Raising the cutoff to MUAC <130 mm reduces the proportion of missed cases, although approximately half of malnourished children are still not identified nationally (52%), and this comes at the cost of more false positives (up to 28% in Tigray). At MUAC <140 mm, the proportion of missed cases drops substantially to approximately one in five nationally (22%), meaning that most malnourished children are detected. However, the trade-off is a sharp increase in false positives, with over one in three non-wasted children misclassified (approximately 36% nationally and up to 62% in Tigray), which would place a heavy burden on treatment programs.
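
To illustrate what these error rates imply for screening yield, the Python sketch below converts sensitivity, specificity, and prevalence into positive and negative predictive values using Bayes' rule. It is an illustration under stated assumptions: the function name `predictive_values` is hypothetical, and the inputs combine the national figures quoted above for MUAC <140 mm (FNR 22%, FPR 36%) with the national WHZ-based GAM prevalence of 12.9%.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed inputs taken from the national figures in the text:
# FNR 22% -> sensitivity 0.78; FPR 36% -> specificity 0.64; prevalence 12.9%
ppv, npv = predictive_values(sensitivity=0.78, specificity=0.64, prevalence=0.129)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under these assumptions, roughly one in four children flagged at MUAC <140 mm would be wasted by WHZ, which makes the referral burden of a higher cutoff concrete.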

MUAC cutoffs can vary across populations, with age, gender, and geographic region influencing optimal diagnostic thresholds. Multiple studies show that a single, standardized MUAC cutoff fails to identify malnutrition accurately across contexts [10]. A study in Ethiopia revealed that optimal MUAC cutoffs ranged from 13.75 cm to 13.85 cm (137.5–138.5 mm) across ethnic regions, with sensitivity varying dramatically [47]. Similarly, another study conducted in the Philippines found that while gender did not affect cutoffs, age substantially influenced optimal thresholds [17]. These variations underscore the need for population-specific nutritional screening approaches that account for local body composition, environmental conditions, and demographic characteristics to maintain diagnostic accuracy.

This study had several strengths and limitations. The dataset, drawn from the SMART+ aggregator, was based on large sample sizes and demonstrated good to excellent quality, supported by real-time data quality checks during collection. While data were available from eight regions, the study was not nationally representative. In addition, the use of SMART flags, which exclude measurements deemed statistically implausible, may have resulted in the omission of a small number of biologically plausible cases. However, sensitivity analysis showed that the impact of these exclusions on GAM prevalence by WHZ was minimal (12.9% without flags vs. 12.5% with flags), and the overall conclusions remain unchanged. Finally, although age and sex were included in our models, the datasets lacked other important factors that could influence the relationship between WHZ and MUAC.

Implications for practice and policy

First, given their partial overlap, MUAC and WHZ should be used together rather than as substitutes, especially in regions with low concordance [48]. Our findings also support considering region-specific MUAC cutoffs to better identify children missed by standard thresholds and ensure timely intervention for acute malnutrition, although the feasibility of implementing such regional variations remains uncertain. Studies have also advocated adjusting MUAC thresholds based on age, sex, and regional body composition norms to improve diagnostic accuracy in surveys [49]. Second, the higher detection rate of MAM and SAM using WHZ suggests that relying solely on MUAC may underestimate the true burden of acute malnutrition. This has program and planning implications: using WHZ may increase caseloads and resource needs but ensures broader coverage. In line with this, the WHO recommends the harmonized use of both indicators to avoid the exclusion of vulnerable children [50]. Finally, nutrition programs and assessments should consider the local epidemiological context and measurement reliability when selecting screening tools for program design and setup.

Conclusion

Our analysis demonstrates that the current MUAC threshold of <125 mm misses the majority of malnourished children, particularly in Somali and Gambella, thus limiting its utility for screening at the community level. Cutoff selection is ultimately a policy decision that balances sensitivity (identifying at-risk children) against specificity (limiting false positives and the resulting burden on program capacity). Although Somali and Gambella exhibited very high false-negative rates at the standard MUAC threshold, this does not imply that MUAC lacks programmatic value. MUAC remains an effective tool for identifying many severely malnourished children and is widely used for community-level screening due to its simplicity, low cost, and feasibility. The observed variation highlights the need for context-specific strategies and complementary approaches, rather than replacing MUAC. In these regions, relying solely on MUAC risks missing most wasted children, and programs may need to incorporate WHZ or combined criteria for case identification. Therefore, we recommend planning and resource allocation based on combined prevalence (WHZ < –2 or MUAC < 125 mm), particularly in Somali and Gambella, to ensure that all at-risk children are identified and supported in a timely manner.
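
A combined prevalence of this kind can be sketched as the share of children meeting either criterion. The Python example below is a simplified illustration: the function name `gam_prevalence`, the record layout, and the toy records are hypothetical, the MUAC cutoff defaults to the standard 125 mm threshold, and survey weights and cluster design are ignored.

```python
def gam_prevalence(children, muac_cutoff_mm=125.0, whz_cutoff=-2.0):
    """Return GAM prevalence by WHZ, by MUAC, and by either criterion.

    `children` is a list of dicts with keys "muac_mm" and "whz".
    Unweighted proportions only; survey design is not accounted for.
    """
    n = len(children)
    if n == 0:
        raise ValueError("no children supplied")
    by_whz = sum(1 for c in children if c["whz"] < whz_cutoff)
    by_muac = sum(1 for c in children if c["muac_mm"] < muac_cutoff_mm)
    by_either = sum(
        1 for c in children
        if c["whz"] < whz_cutoff or c["muac_mm"] < muac_cutoff_mm
    )
    return {"whz": by_whz / n, "muac": by_muac / n, "combined": by_either / n}


# Hypothetical toy records (not survey data)
toy_children = [
    {"muac_mm": 120, "whz": -2.5},  # wasted by both criteria
    {"muac_mm": 128, "whz": -2.4},  # wasted by WHZ only
    {"muac_mm": 123, "whz": -1.0},  # wasted by MUAC only
    {"muac_mm": 135, "whz": -0.5},
    {"muac_mm": 140, "whz": 0.3},
]
print(gam_prevalence(toy_children))
```

In the toy data, each single indicator flags 40% of children while the combined definition flags 60%, showing how caseload planning changes when the two indicators identify different children.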

Bibliography

1. Mertens A, Benjamin-Chung J, Colford JM Jr, Hubbard AE, van der Laan MJ, Coyle J, et al. Child wasting and concurrent stunting in low- and middle-income countries. Nature. 2023;621(7979):558–67. doi: 10.1038/s41586-023-06480-z. PMID: 37704720; PMCID: PMC10511327.
2. World Health Organization. Levels and trends in child malnutrition: UNICEF/WHO/World Bank Group Joint Child Malnutrition Estimates: Key findings of the 2023 edition. World Health Organization; 2023.
3. Black RE, Victora CG, Walker SP, Bhutta ZA, Christian P, de Onis M, et al. Maternal and child undernutrition and overweight in low-income and middle-income countries. Lancet. 2013;382(9890):427–51. doi: 10.1016/S0140-6736(13)60937-X. PMID: 23746772.
4. McDonald CM, Olofin I, Flaxman S, Fawzi WW, Spiegelman D, Caulfield LE, et al. The effect of multiple anthropometric deficits on child mortality: meta-analysis of individual data in 10 prospective studies from developing countries. Am J Clin Nutr. 2013;97(4):896–901. doi: 10.3945/ajcn.112.047639. PMID: 23426036.
5. FAO, UNHCR, UNICEF, WFP, and WHO. Global action plan on child wasting: a framework for action to accelerate progress in preventing and managing child wasting and the achievement of the Sustainable Development Goals. Food and Agriculture Organization, United Nations High Commissioner for Refugees, United Nations Children’s Fund, World Food Programme, World Health Organization; 2020.
6. Ethiopian Public Health Institute (EPHI) [Ethiopia] and ICF. 2019 Ethiopia Mini Demographic and Health Survey Final Report. Rockville (MD): EPHI and ICF; 2019.
7. Schoonees A, Lombard MJ, Musekiwa A, Nel E, Volmink J. Ready-to-use therapeutic food (RUTF) for home-based nutritional rehabilitation of severe acute malnutrition in children from six months to five years of age. Cochrane Database Syst Rev. 2019;5(5):CD009000. doi: 10.1002/14651858.CD009000.pub3. PMID: 31090070; PMCID: PMC6537457.
8. WHO. Guideline on the management of wasting in children. World Health Organization; 2020.