# Methods for Analytical Validation of Novel Digital Clinical Measures: Implementation Feasibility Evaluation Using Real-World Datasets

**Authors:** Simon Turner, Lysbeth Floden, Leif Simmatis, Piper Fromy, Joss Langford, Eric J Daza, Andrew Potter, Kathleen Troeger

PMC · DOI: 10.2196/70314 · Journal of Medical Internet Research · 2025-11-17

## TL;DR

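This paper applies three statistical methods (Pearson correlation, linear regression, and confirmatory factor analysis) to four real-world sensor datasets to test whether they can feasibly be used for analytical validation of novel digital measures. Confirmatory factor analysis proved feasible, and study design factors such as temporal and construct coherence affected the strength of the estimated relationships.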

## Contribution

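The paper applies statistical methods previously tested in simulation to hypothetical analytical validation studies built from four real-world datasets, demonstrates the feasibility of confirmatory factor analysis in this context, and derives practical study design recommendations that support a standardized approach to validating novel digital measures.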

## Key findings

- Most confirmatory factor analysis models showed acceptable fit on the majority of fit statistics, and correlations were strongest in hypothetical studies with strong temporal and construct coherence.
- The performance of statistical methods supports their feasibility in real-world data for validating novel digital measures.
- Study design factors (temporal coherence, construct coherence, and data completeness) affect the estimated relationships between digital measures and reference measures.

## Abstract

Sensor-based digital health technologies (sDHTs) are increasingly used to support scientific and clinical decision-making. The digital measures (DMs) they generate offer significant potential to accelerate the drug development timeline, decrease clinical trial costs, and improve access to care. However, choosing an appropriate statistical methodology when conducting analytical validation (AV) of a DM is complicated, particularly for novel DMs, for which appropriate, established reference measures (RMs) may not exist. More understanding of, and a standardized approach to, AV in these scenarios is needed.

In a prior simulation study, 3 statistical methods were tested for their ability to estimate a simulated relationship between an sDHT-derived DM and several clinical outcome assessment (COA) RMs. The aim of this work was to assess the feasibility of implementing these methods in real data and to examine the impact of AV study design factors on the relationships estimated.

Four real-world datasets, captured using sDHTs, were used to prepare hypothetical AV studies representing a range of scenarios with respect to 3 key study design properties: temporal coherence, construct coherence, and data completeness. The datasets analyzed were as follows: Urban Poor (comparing nighttime awakenings to measures of psychological well-being), STAGES (comparing daily step count to psychological and fatigue measures), mPower (comparing daily smartphone screen taps to measures of function in Parkinson’s disease), and Brighten (comparing smartphone communication activity to measures of psychological well-being). For each hypothetical AV study, 3 statistical methods were applied: the Pearson correlation coefficient (PCC) between DM and RM; linear regression, comprising simple linear regression (SLR) between DM and RM and multiple linear regression (MLR) between DMs and combinations of RMs; and 2-factor, correlated-factor confirmatory factor analysis (CFA) models. Performance measures were the PCC magnitudes (for PCC), R² and adjusted R² statistics (for SLR and MLR, respectively), and factor correlations (for CFA).
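As a rough illustration of the non-CFA performance measures named above, the following Python sketch computes a PCC, an SLR R², and an MLR adjusted R² on synthetic data. It is not the authors' code: the function names (`pearson_r`, `r_squared`), the simulated variables (`dm`, `rm1`, `rm2`), and the choice to treat the DM as the regression outcome are all assumptions made for this example.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))


def r_squared(X, y, adjusted=False):
    """R^2 (or adjusted R^2) of an ordinary least squares fit of y on X.

    X may be 1-D (simple regression) or 2-D with one column per predictor.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    if adjusted:
        # Penalise the fit for the number of predictors p.
        r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return float(r2)


# Synthetic stand-ins (assumption): one digital measure and two reference
# measures that share a common latent signal plus independent noise.
rng = np.random.default_rng(0)
n_obs = 200
latent = rng.normal(size=n_obs)
dm = latent + rng.normal(scale=1.0, size=n_obs)
rm1 = latent + rng.normal(scale=1.0, size=n_obs)
rm2 = latent + rng.normal(scale=1.5, size=n_obs)

pcc = pearson_r(dm, rm1)                      # PCC between DM and one RM
slr_r2 = r_squared(rm1, dm)                   # SLR: DM regressed on one RM
mlr_adj_r2 = r_squared(np.column_stack([rm1, rm2]), dm, adjusted=True)

print(f"PCC magnitude: {abs(pcc):.3f}")
print(f"SLR R^2: {slr_r2:.3f}")
print(f"MLR adjusted R^2: {mlr_adj_r2:.3f}")
```

For a single predictor, the SLR R² equals the squared PCC, so the two measures carry the same information in that case; the adjusted R² becomes informative once several RMs are combined in one model.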

Most of the CFA models exhibited an acceptable fit according to the majority of the fit statistics employed, and each model was able to estimate a factor correlation. For each model, these correlations were greater than or equal to the corresponding PCC in magnitude. Correlations were the strongest in the hypothetical studies with strong temporal and construct coherence.
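One way to read the comparison between factor correlations and PCCs is through the general form of a 2-factor, correlated-factor measurement model. The sketch below is an assumption about that general form, not the authors' exact specification. The symbols are introduced here for illustration only: standardized indicators $x_i$ and $x_j$, factors $\xi_1$ (DM) and $\xi_2$ (RM), standardized loadings $\lambda$, and errors $\varepsilon$ that are uncorrelated with the factors and with each other.

```latex
x_i = \lambda_i \xi_1 + \varepsilon_i \quad (\text{DM indicators}), \qquad
x_j = \lambda_j \xi_2 + \varepsilon_j \quad (\text{RM indicators}), \qquad
\operatorname{Corr}(\xi_1, \xi_2) = \phi

\operatorname{Corr}(x_i, x_j) = \lambda_i \lambda_j \phi
\quad \Longrightarrow \quad
\lvert \operatorname{Corr}(x_i, x_j) \rvert \le \lvert \phi \rvert
\quad \text{whenever } \lvert \lambda_i \rvert, \lvert \lambda_j \rvert \le 1
```

Under these assumptions, measurement error in the observed indicators attenuates their correlation relative to the factor correlation $\phi$, which is consistent with, though does not by itself establish, the pattern reported above.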

The performance of the selected statistical methods shown in this work supports their feasibility when implemented in real-world data. Our findings, in particular, support the use of CFA to assess the relationship between a novel DM and a COA RM. The observed impact of AV study design factors on the relationships estimated allowed the authors to determine practical recommendations for study design in AV of novel DMs. By using a standardized methodology for evaluating novel DMs, sDHT developers, biostatisticians, and clinical researchers can navigate the complex validation landscape more easily, with more certainty, and with more tools at their disposal.

## Linked entities

- **Diseases:** Parkinson’s disease (MONDO:0005180)

## Full-text entities

- **Diseases:** Parkinson's disease (MESH:D010300), fatigue (MESH:D005221)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12622859/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/PMC12622859/full.md

---
Source: https://tomesphere.com/paper/PMC12622859