# Beyond Bonferroni: new multiple contrast tests for time-to-event data under non-proportional hazards

**Authors:** Ina Dormuth, Carolin Herrmann, Frank Konietschke, Markus Pauly, Matthias Wirth, Marc Ditzhaus

PMC · DOI: 10.1007/s10985-025-09676-9 · 2026-01-14

## TL;DR

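The paper introduces new multiple contrast tests for comparing several groups in clinical trials with time-to-event data. The tests control the familywise error rate without a separate p-value correction and are more powerful than Bonferroni-adjusted log-rank tests in some non-proportional hazards settings.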

## Contribution

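The paper proposes two new multiple contrast tests for time-to-event data: one combines weighted log-rank tests directly, and the other extends the CASANOVA test, which was originally introduced for factorial designs. Both control the familywise error rate (FWER) by design, so no additional p-value correction is needed.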
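To make "FWER control by design" concrete, here is a generic sketch of how a multiple contrast test differs from a Bonferroni correction. The notation ($m$ hypotheses $H_\ell$, test statistics $T_\ell$, p-values $p_\ell$, level $\alpha$, critical value $q_{1-\alpha}$) is assumed for illustration and is not taken from the paper. This summary does not state how the paper obtains its critical values, so the second line should be read as the general principle only.

$$
\text{Bonferroni:}\quad \mathrm{FWER} \;\le\; \sum_{\ell=1}^{m} P_{H_0}\!\left(p_\ell \le \tfrac{\alpha}{m}\right) \;\le\; m \cdot \tfrac{\alpha}{m} \;=\; \alpha
$$

$$
\text{Multiple contrast test:}\quad \text{reject } H_\ell \iff |T_\ell| > q_{1-\alpha}, \qquad P_{H_0}\!\left(\max_{1 \le \ell \le m} |T_\ell| > q_{1-\alpha}\right) \approx \alpha
$$

The union bound in the first line is tight only when the individual rejection events barely overlap. A critical value taken from the joint distribution of the maximum statistic is never larger than the Bonferroni one and is typically smaller when the statistics are correlated, which is where a power gain can come from.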

## Key findings

- The new tests outperform the Bonferroni-adjusted reference approaches in terms of power in some non-proportional hazards scenarios.
- In Monte Carlo simulations covering proportional and non-proportional hazards, the proposed tests control the familywise error rate and show reasonable power in all scenarios.
- The CASANOVA-based test shows promise of being more powerful under crossing hazards.

## Abstract

When comparing multiple groups in clinical trials, we are interested not only in whether there is a difference between any groups but also in where the difference is. Such research questions lead to testing multiple individual hypotheses. To control the familywise error rate (FWER), we must apply some corrections or introduce tests that control the FWER by design. In the case of time-to-event data, a Bonferroni-corrected log-rank test is commonly used. This approach has two significant drawbacks: (i) it loses power when the proportional hazards assumption is violated and (ii) the correction generally leads to a lower power, especially when the test statistics are not independent. We propose two new tests based on combined weighted log-rank tests. One is a simple multiple contrast test of weighted log-rank tests; the other is a multiple contrast extension of the so-called CASANOVA test, which was introduced for factorial designs. The CASANOVA-based test shows promise of being more powerful under crossing hazards and eliminates the need for additional p-value correction. We assess the performance of our tests through extensive Monte Carlo simulation studies covering both proportional and non-proportional hazard scenarios. Finally, we apply the new and reference methods to a real-world data example. The new approaches control the FWER and show reasonable power in all scenarios. They outperform the adjusted approaches in some non-proportional settings in terms of power.

The online version contains supplementary material available at 10.1007/s10985-025-09676-9.
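As a point of reference for the abstract, the following is a minimal sketch of the baseline it criticizes: all pairwise log-rank tests with a Bonferroni adjustment. It is not the paper's proposed method. The function names `weighted_logrank` and `bonferroni_pairwise_logrank`, the Fleming-Harrington weights controlled by `rho` and `gamma` (one common family of weighted log-rank tests, assumed here because the abstract does not name the weights), and the toy data are all illustrative assumptions.

```python
import math
from itertools import combinations


def weighted_logrank(t1, e1, t2, e2, rho=0.0, gamma=0.0):
    """Two-sample Fleming-Harrington weighted log-rank test (illustrative).

    t1, t2: observed times; e1, e2: 1 = event, 0 = censored.
    rho = gamma = 0 gives the classical log-rank test.
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    times = list(t1) + list(t2)
    events = list(e1) + list(e2)
    # Distinct event times in the pooled sample.
    event_times = sorted({t for t, e in zip(times, events) if e})
    surv = 1.0  # pooled Kaplan-Meier estimate just before the current time
    num = 0.0
    var = 0.0
    for t in event_times:
        n1 = sum(1 for x in t1 if x >= t)  # at risk in group 1
        n2 = sum(1 for x in t2 if x >= t)  # at risk in group 2
        d1 = sum(1 for x, e in zip(t1, e1) if x == t and e)
        d2 = sum(1 for x, e in zip(t2, e2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        w = surv ** rho * (1.0 - surv) ** gamma  # Fleming-Harrington weight
        num += w * (d1 - d * n1 / n)  # observed minus expected events
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            var += w * w * d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
        surv *= 1.0 - d / n
    if var == 0.0:
        return 0.0, 1.0
    chi2 = num * num / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_pairwise_logrank(groups, alpha=0.05, rho=0.0, gamma=0.0):
    """All pairwise tests; groups maps a name to (times, events).

    Returns {(a, b): (raw p, Bonferroni-adjusted p, reject at alpha)}.
    """
    pairs = list(combinations(groups, 2))
    m = len(pairs)
    results = {}
    for a, b in pairs:
        _, p = weighted_logrank(*groups[a], *groups[b], rho=rho, gamma=gamma)
        p_adj = min(1.0, m * p)  # Bonferroni: multiply by number of tests
        results[(a, b)] = (p, p_adj, p_adj <= alpha)
    return results


# Invented toy data for illustration only (not from the paper).
toy_groups = {
    "A": ([2, 3, 5, 7, 8, 11, 12, 15], [1, 1, 1, 1, 0, 1, 1, 0]),
    "B": ([4, 6, 9, 10, 13, 16, 18, 20], [1, 1, 0, 1, 1, 1, 0, 1]),
    "C": ([1, 2, 4, 5, 6, 8, 9, 10], [1, 1, 1, 1, 1, 0, 1, 1]),
}

for pair, (p, p_adj, reject) in bonferroni_pairwise_logrank(toy_groups).items():
    print(pair, round(p, 4), round(p_adj, 4), reject)
```

Each pairwise test here is evaluated on its own and the p-values are simply multiplied by the number of comparisons, so the dependence between the test statistics is ignored. That is the source of the power loss described in drawback (ii) of the abstract.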

## Full-text entities

- **Diseases:** Multiple myeloma (MM; MESH:D009101)
- **Chemicals:** mdir (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12804333/full.md

---
Source: https://tomesphere.com/paper/PMC12804333