# Reliability of subjective evaluation in assessing elite table tennis players’ performance

**Authors:** Lulu Gan, Jing Chen, Luning Wang, Yunfei Lu, Jie Ren

PMC · DOI: 10.3389/fpsyg.2025.1432711 · 2025-06-30

## TL;DR

This study examines how reliable subjective evaluations of elite table tennis players are, finding that observer expertise and information availability significantly affect evaluation consistency.

## Contribution

The study introduces a novel analysis of how observer skill and information conditions influence the reliability of subjective performance assessments in table tennis.

## Key findings

- Intra-observer reliability was good (r = 0.61–0.86), but inter-observer consistency was low (k = 0.01–0.39).
- Expert observers showed higher consistency than advanced and novice observers across all indicators.
- Occluding kinematic information reduced evaluation consistency, especially for tactical behavior.

## Abstract

This study aims to assess the reliability of subjective evaluations conducted under two information conditions and to explore the influence of observer expertise on the consistency of performance assessments of elite table tennis players.

Observers of varying skill levels provided subjective evaluations of elite table tennis players’ performance by observing specific rally strokes during matches. A Video Masking Paradigm was used to conceal motion information at the critical moments of scoring and losing points. The weighted Kappa coefficient (k) was used to evaluate inter-observer consistency between two observers. Kendall’s coefficient of concordance (w), a measure of inter-rater agreement for ordinal scales (e.g., a five-point Likert scale), was used when multiple raters were involved.
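The two agreement statistics named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the kappa weights are assumed to be linear (the abstract does not state the weighting scheme), Kendall's coefficient is computed with the usual tie correction (also an assumption), and the ratings in the demo are invented.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores from 1..n, giving tied values their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for m raters (rows) scoring the same n items (columns).

    Assumes the tie-corrected formula; undefined if every rater gives
    all items the same score (the denominator becomes zero).
    """
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(r) for r in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over every group of tied scores.
    ties = sum(sum(c ** 3 - c for c in Counter(r).values()) for r in ratings)
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


def weighted_kappa(a, b, n_categories=5):
    """Linearly weighted Cohen's kappa for two raters scoring 1..n_categories.

    Linear disagreement weights |i - j| / (k - 1) are an assumption here.
    """
    n = len(a)
    observed = Counter(zip(a, b))
    marg_a, marg_b = Counter(a), Counter(b)
    num = den = 0.0
    for i in range(1, n_categories + 1):
        for j in range(1, n_categories + 1):
            w = abs(i - j) / (n_categories - 1)
            num += w * observed[(i, j)] / n               # observed disagreement
            den += w * marg_a[i] * marg_b[j] / (n * n)    # chance disagreement
    return 1 - num / den


if __name__ == "__main__":
    # Invented five-point Likert ratings of six rallies, for illustration only.
    rater_1 = [4, 3, 5, 2, 4, 1]
    rater_2 = [4, 2, 5, 3, 3, 1]
    rater_3 = [5, 3, 4, 2, 4, 2]
    print("weighted kappa:", weighted_kappa(rater_1, rater_2))
    print("Kendall's W:", kendalls_w([rater_1, rater_2, rater_3]))
```

Both statistics equal 1 when raters agree perfectly. Kappa compares observed disagreement with the disagreement expected from each rater's marginal score distribution, while W measures how closely the raters' rank orderings of the items coincide.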

Intra-observer reliability was good (r = 0.61–0.86), whereas inter-observer consistency between the two observers was low (k = 0.01–0.39). Among the observation indicators, the advanced group showed the lowest consistency in evaluating tactical behavior (without results, w = 0.44; with results, w = 0.76). In Experiment 2, observer consistency in the without-results condition (expert group w = 0.75 vs. advanced group w = 0.57 vs. novice group w = 0.66) was lower than in the with-results condition (expert group w = 0.84 vs. advanced group w = 0.78 vs. novice group w = 0.76). Across all three observation indicators, namely stroke quality, tactical intention, and competitive posture, the expert group demonstrated the highest level of consistency, followed by the advanced group, while the novice group exhibited the lowest level of agreement.

Observers with table tennis skills demonstrate high intra-observer test–retest reliability in subjective evaluations, but lower inter-observer consistency. The information condition (with or without results) is a key variable affecting the consistency of subjective evaluations: when kinematic information is occluded (without results), consistency decreases. The selection of observation indicators also affects consistency. Additionally, consistency depends on the observer’s level of experience and skill: the higher the observer’s level and experience, the greater the consistency of their subjective evaluations.

## Full-text entities

- **Diseases:** stroke (MESH:D020521)

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12257951/full.md

---
Source: https://tomesphere.com/paper/PMC12257951