# Interobserver agreement: Individual CTG features show better agreement among investigators than the overall CTG assessment in cases of meconium-stained amniotic fluid

**Authors:** Linas Rovas, Meile Minkauskiene, Kristina Berskiene, Vaiva Maciulionyte, Akvile Papievyte, Ruta Petkeviciute, Augusta Petrusaite, Agne Pinauskaite

PMC · DOI: 10.18332/ejm/215682 · 2025-12-31

## TL;DR

This study found that individual CTG features show better interobserver agreement than the overall CTG categorization when amniotic fluid is meconium-stained.

## Contribution

The study demonstrates that individual CTG features are more reliably assessed than overall CTG categorization.

## Key findings

- Baseline rate, variability, and deceleration presence showed moderate to very good interobserver agreement.
- Overall CTG categorization had poor to moderate agreement among clinicians.
- Objective CTG features may be more reliable than categorical assessments for clinical use.

## Abstract

The objective of this investigation was to evaluate the interobserver agreement between different investigators on selected cardiotocogram (CTG) parameters.

Medical records were selected from birth histories of cephalic deliveries with meconium-stained amniotic fluid. A total of 84 CTGs were recorded and analyzed by six clinicians. Agreement metrics, such as the proportion of agreement (Pa) with corresponding 95% confidence intervals (95% CIs) and reliability indices calculated via the Fleiss kappa statistic, were employed to quantify interobserver consistency.
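For readers unfamiliar with the statistic, the following is a minimal sketch of the standard Fleiss kappa formula for several raters assigning each tracing to one category. It is not the authors' analysis code. The function names, the label strings, and the toy ratings are assumptions made for illustration, and the mean pairwise agreement shown here may not match how the paper defines Pa.

```python
from collections import Counter


def mean_pairwise_agreement(ratings):
    """Average, over subjects, of the share of rater pairs that agree.

    `ratings` is a list with one entry per subject (e.g. per CTG tracing);
    each entry is a list of category labels, one per rater.
    """
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    per_subject = []
    for row in ratings:
        counts = Counter(row)
        # Number of agreeing ordered rater pairs divided by all ordered pairs.
        agreeing = sum(c * (c - 1) for c in counts.values())
        per_subject.append(agreeing / (n_raters * (n_raters - 1)))
    return sum(per_subject) / len(per_subject)


def fleiss_kappa(ratings):
    """Fleiss kappa: (observed - chance agreement) / (1 - chance agreement)."""
    observed = mean_pairwise_agreement(ratings)
    total = sum(len(row) for row in ratings)
    # Marginal share of each category across all subjects and raters.
    category_totals = Counter(label for row in ratings for label in row)
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        # Only one category was ever used; kappa is undefined in that case.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 4 tracings, 6 raters, 3 categories (not study data).
toy = [
    ["normal"] * 6,
    ["normal", "normal", "suspicious", "suspicious", "normal", "normal"],
    ["pathological"] * 5 + ["suspicious"],
    ["suspicious", "suspicious", "suspicious", "normal", "pathological", "suspicious"],
]
print(round(fleiss_kappa(toy), 3))
```

Kappa is 1 when every rater agrees on every tracing, close to 0 when agreement is what chance alone would produce, and negative when raters agree less often than chance.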

The CTG parameters baseline rate, variability, presence or absence of decelerations, and total time of decelerations demonstrated good or moderate interobserver agreement (kappa: 0.47–0.80), indicating fairly high consistency in estimating these parameters. The kappa coefficients for these features ranged from moderate to very good levels. The assessment of accelerations exhibited only weak to moderate concordance (kappa: 0.29–0.47). Evaluation of the deceleration type yielded the lowest agreement. The overall categorization of CTGs exhibited only poor to moderate interobserver concordance (Fleiss kappa: 0.19–0.44).

The CTG parameters baseline rate, variability, presence or absence of decelerations, and total width of decelerations in a 30-minute CTG interval can be interpreted with a high degree of objectivity and agreement given appropriate training, even without clinical experience. Because the categorization of CTGs into normal, suspicious, and pathological shows only a poor to moderate level of agreement, it is worth discussing whether to continue relying on such categorical stratification or to base CTG judgements on more objective, high-agreement parameters.

## Full-text entities

- **Diseases:** tachycardia (MESH:D013610), hypoxic injury (MESH:D002534), acidemia (MESH:C537358), bradycardia (MESH:D001919), fetal hypoxia (MESH:D005311), fetal anomalies (MESH:D000013), PA (MESH:C535387), acidosis (MESH:D000138)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** KB — Homo sapiens (Human), Human papillomavirus-related endocervical adenocarcinoma, Cancer cell line (CVCL_0372)

---
Source: https://tomesphere.com/paper/PMC12810198