# Upstrapping to determine futility: predicting future outcomes nonparametrically from past data

**Authors:** Jessica L. Wild, Adit A. Ginde, Christopher J. Lindsell, Alexander M. Kaizer

PMC · DOI: 10.1186/s13063-024-08136-3 · Trials · 2024-05-09

## TL;DR

This paper introduces upstrapping, a nonparametric method for interim futility monitoring in clinical trials, showing it can be calibrated to balance sample size, power, and error rates.

## Contribution

The paper proposes and evaluates upstrapping as a novel nonparametric method for futility monitoring in clinical trials.

## Key findings

- Upstrapping is much more likely to find evidence of futility in the null scenario than in the alternative scenario, across a variety of simulation settings.
- Type I error rates of upstrapping differ by at most 1.7% compared to O’Brien-Fleming methods.
- Compared to O’Brien-Fleming group sequential methods, upstrapped approaches reduce expected sample size by 2–22% in the null scenario and by 0–15% in the alternative scenario.

## Abstract

Clinical trials often involve some form of interim monitoring to determine futility before planned trial completion. While many options for interim monitoring exist (e.g., alpha-spending, conditional power), nonparametric interim monitoring methods are also needed to account for more complex trial designs and analyses. The upstrap is one recently proposed nonparametric method that may be applied for interim monitoring.

Upstrapping is motivated by the case resampling bootstrap and involves repeatedly sampling with replacement from the interim data to simulate thousands of fully enrolled trials. A p-value is calculated for each upstrapped trial, and the proportion of upstrapped trials for which the p-value criteria are met is compared with a pre-specified decision threshold. To evaluate the potential utility of upstrapping as a form of interim futility monitoring, we conducted a simulation study considering different sample sizes with several different proposed calibration strategies for the upstrap. We first compared trial rejection rates across a selection of threshold combinations to validate the upstrapping method. Then, we applied upstrapping methods to simulated clinical trial data, directly comparing their performance with more traditional alpha-spending and conditional power interim monitoring methods for futility.
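The resampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions made for illustration only: a two-arm trial with a continuous outcome, resampling within each arm, a two-sided z-test with a normal approximation as the analysis, and a rule that flags futility when the proportion of "significant" upstrapped trials falls below the decision threshold. The function names, default thresholds, and simulated data are all hypothetical.

```python
import math
import random
import statistics


def z_test_p_value(x, y):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumption: the summary does not name the test used; this is a stand-in.
    """
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    if se == 0.0:
        # Degenerate resample with no variability in either arm.
        return 1.0 if statistics.mean(x) == statistics.mean(y) else 0.0
    z = (statistics.mean(x) - statistics.mean(y)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def upstrap_proportion(control, treatment, n_full_per_arm,
                       p_threshold=0.05, n_upstraps=1000, seed=0):
    """Proportion of upstrapped full-size trials with p-value below p_threshold.

    Each upstrapped trial samples WITH replacement from the interim data
    up to the planned full sample size (assumed here: within each arm).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_upstraps):
        up_control = rng.choices(control, k=n_full_per_arm)
        up_treatment = rng.choices(treatment, k=n_full_per_arm)
        if z_test_p_value(up_control, up_treatment) < p_threshold:
            hits += 1
    return hits / n_upstraps


def stop_for_futility(proportion, decision_threshold):
    """Assumed decision rule: flag futility when few upstrapped trials 'succeed'."""
    return proportion < decision_threshold


# Hypothetical interim look: 30 of a planned 100 participants per arm.
data_rng = random.Random(42)
interim_control = [data_rng.gauss(0.0, 1.0) for _ in range(30)]
interim_treatment = [data_rng.gauss(0.0, 1.0) for _ in range(30)]  # null scenario

proportion = upstrap_proportion(interim_control, interim_treatment, n_full_per_arm=100)
print(f"proportion of upstrapped trials with p < 0.05: {proportion:.3f}")
print("stop for futility:", stop_for_futility(proportion, decision_threshold=0.10))
```

The two tuning knobs in this sketch, `p_threshold` and `decision_threshold`, correspond to the p-value criterion and the pre-specified decision threshold mentioned above. The calibration strategies evaluated in the paper are not reproduced here.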

The method validation demonstrated that upstrapping is much more likely to find evidence of futility in the null scenario than in the alternative across a variety of simulation settings. Our three proposed approaches for calibration of the upstrap had different strengths depending on the stopping rules used. Compared to O’Brien-Fleming group sequential methods, upstrapped approaches had type I error rates that differed by at most 1.7% and expected sample size was 2–22% lower in the null scenario, while in the alternative scenario power fluctuated between 15.7% lower and 0.2% higher and expected sample size was 0–15% lower.

In this proof-of-concept simulation study, we evaluated the potential for upstrapping as a resampling-based method for futility monitoring in clinical trials. The trade-offs in expected sample size, power, and type I error rate control indicate that the upstrap can be calibrated to implement futility monitoring with varying degrees of aggressiveness, and that its performance can be made similar to that of the alpha-spending and conditional power futility monitoring methods considered.

The online version contains supplementary material available at 10.1186/s13063-024-08136-3.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11083808/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11083808/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/PMC11083808/full.md

---
Source: https://tomesphere.com/paper/PMC11083808