# Evaluating inter-rater reliability in the context of “Sysmex UN2000 detection of protein/creatinine ratio and of renal tubular epithelial cells can be used for screening lupus nephritis”: a statistical examination

**Authors:** Ming Li, Qian Gao, Jing Yang, Tianfei Yu

PMC · DOI: 10.1186/s12882-024-03540-y · BMC Nephrology · 2024-03-13

## TL;DR

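This paper re-examines how agreement between two measurement methods, the AU5800 and UN2000 analyzers, was evaluated in Chen et al.'s study on screening for lupus nephritis, and argues that the choice of Kappa statistic should match the type of variable being rated.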

## Contribution

The paper introduces an alternative statistical approach for assessing inter-rater reliability and emphasizes proper statistical reporting.

## Key findings

- Chen et al.'s statistical approach did not change their findings but may have underestimated the agreement between AU5800 and UN2000.
- Researchers should carefully choose appropriate Kappa statistics based on variable types.
- Proper computation and reporting of inter-rater reliability is crucial for accurate hypothesis testing.

## Abstract

The evaluation of inter-rater reliability (IRR) is integral to research designs in which two raters assess observational ratings. However, the existing literature reports statistical procedures and the evaluation of IRR inconsistently, even though this information can affect subsequent hypothesis testing.

This paper evaluates a recent publication by Chen et al., featured in BMC Nephrology, aiming to introduce an alternative statistical approach to assessing IRR and discuss its statistical properties. The study underscores the crucial need for selecting appropriate Kappa statistics, emphasizing the accurate computation, interpretation, and reporting of commonly used IRR statistics between two raters.

Cohen’s Kappa statistic is typically used when two raters assign items to two categories, or to three or more unordered categories. When the two raters assign items to three or more ordered categories, the commonly employed measure of concordance is the weighted Kappa.
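For reference, the two statistics can be written in their standard textbook form. The notation here is an assumption chosen for illustration and is not taken from the paper: $p_{ij}$ is the proportion of items placed in category $i$ by the first rater and category $j$ by the second, $p_{i\cdot}$ and $p_{\cdot j}$ are the row and column marginals, and $k$ is the number of categories.

$$
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_o = \sum_{i=1}^{k} p_{ii}, \qquad p_e = \sum_{i=1}^{k} p_{i\cdot}\, p_{\cdot i}
$$

$$
\kappa_w = 1 - \frac{\sum_{i,j} w_{ij}\, p_{ij}}{\sum_{i,j} w_{ij}\, p_{i\cdot}\, p_{\cdot j}}, \qquad w_{ij} = \frac{|i-j|}{k-1} \ \text{(linear)} \quad \text{or} \quad w_{ij} = \frac{(i-j)^2}{(k-1)^2} \ \text{(quadratic)}
$$

Here $w_{ij}$ are disagreement weights. Setting $w_{ij} = 1$ for every $i \neq j$ and $w_{ii} = 0$ reduces $\kappa_w$ to the unweighted $\kappa$, which is why the unweighted statistic treats a one-category disagreement the same as a disagreement across the whole scale.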

Chen and colleagues might have underestimated the agreement between AU5800 and UN2000. Although the statistical approach adopted in Chen et al.’s research did not alter their findings, researchers should be discerning in choosing statistical techniques that fit their specific research questions.
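The following is a minimal sketch of how an unweighted Kappa can come out lower than a weighted Kappa on ordered categories. The function `cohen_kappa`, the four-level ordinal coding (0 to 3), and the 20 paired ratings are all hypothetical and chosen for illustration; they are not data or code from Chen et al. or from this paper. In the made-up data every disagreement is between adjacent grades.

```python
def cohen_kappa(rater_a, rater_b, n_categories, weights=None):
    """Cohen's kappa for two raters over ordinal codes 0..n_categories-1.

    weights=None gives the unweighted statistic; "linear" and "quadratic"
    give weighted variants based on disagreement weights.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear' or 'quadratic'")

    k, n = n_categories, len(rater_a)

    # Cross-tabulate the paired ratings (rows: rater A, columns: rater B).
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        counts[a][b] += 1
    row = [sum(counts[i]) for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Unweighted: every disagreement counts fully, whatever its distance.
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * counts[i][j] for i, j in cells) / n
    expected = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected


# Hypothetical paired ratings from two methods on a 4-level ordinal scale.
pairs = (
    [(0, 0)] * 5 + [(0, 1)]
    + [(1, 1)] * 4 + [(1, 2)]
    + [(2, 1)] + [(2, 2)] * 4 + [(2, 3)]
    + [(3, 3)] * 3
)
method_a = [a for a, _ in pairs]
method_b = [b for _, b in pairs]

unweighted = cohen_kappa(method_a, method_b, 4)
linear = cohen_kappa(method_a, method_b, 4, weights="linear")
quadratic = cohen_kappa(method_a, method_b, 4, weights="quadratic")
print(f"unweighted={unweighted:.3f} linear={linear:.3f} quadratic={quadratic:.3f}")
```

On this made-up data the unweighted value is the lowest of the three, because it gives no partial credit to near-misses on the ordinal scale. Whether the same ordering holds for a real data set depends on how far apart the disagreements fall.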

## Linked entities

- **Diseases:** lupus nephritis (MONDO:0005556)

## Full-text entities

- **Diseases:** lupus nephritis (MESH:D008181)
- **Chemicals:** creatinine (MESH:D003404)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10938658/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/PMC10938658/full.md

---
Source: https://tomesphere.com/paper/PMC10938658