# Addressing common inferential mistakes when failing to reject the null-hypothesis

**Authors:** Amand Schmidt, J. Alexander Heimel, Ying Cui

PMC · DOI: 10.12688/f1000research.158434.1 · 2024-12-05

## TL;DR

This paper explains why failing to reject a null hypothesis does not demonstrate the absence of an effect, and it recommends statistical approaches for medical research that are better suited to that question.

## Contribution

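The paper highlights flaws in common statistical practices (reading non-significance as no association, post-hoc power calculations, and multiplicity corrections) and advocates focusing on estimation accuracy, replication, and consistency rather than sorting results into true and false.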

## Key findings

- Traditional statistical tests cannot conclusively show the absence of an association.
- Post-hoc power calculations are misleading and should be avoided.
- Multiplicity corrections often fail to distinguish true from false positives.
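
The second finding can be illustrated with a minimal sketch. It assumes a two-sided z-test and the usual "observed power" definition, in which the observed effect is plugged in as if it were the true effect. Neither assumption comes from the paper, and the function name `observed_power` is hypothetical. Under those assumptions, post-hoc power is a fixed function of the p-value and so carries no information beyond it.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def observed_power(p_value: float, alpha: float = 0.05) -> float:
    """Post-hoc ("observed") power of a two-sided z-test.

    Assumption (not from the paper): the observed effect is treated
    as the true effect, which is what post-hoc power usually does.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    # Recover the absolute z-statistic implied by the two-sided p-value.
    z_obs = _STD_NORMAL.inv_cdf(1.0 - p_value / 2.0)
    # Critical value for the chosen significance level.
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in either tail if the true z equals z_obs.
    return _STD_NORMAL.cdf(z_obs - z_crit) + _STD_NORMAL.cdf(-z_obs - z_crit)


if __name__ == "__main__":
    for p in (0.001, 0.05, 0.20, 0.50):
        print(f"p = {p:<5} -> observed power = {observed_power(p):.3f}")
```

Because the only input is the p-value, two studies with the same p-value receive the same post-hoc power regardless of sample size. A p-value equal to alpha always gives an observed power of roughly 50%.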

## Abstract

Failure to reject a null hypothesis may lead to erroneous conclusions regarding the absence of an association or inadequate statistical power. Because an estimate (and its variance) can never be exactly zero, traditional statistical tests cannot conclusively demonstrate the absence of an association. Instead, estimates of accuracy should be used to identify settings in which an association and its variability are sufficiently small to be clinically acceptable, directly providing information on safety and efficacy. Post-hoc power calculations should be avoided, as they offer no additional information beyond statistical tests and p-values. Furthermore, post-hoc power calculations can be misleading because of an inability to distinguish between results based on insufficient sample size and results that reflect clinically irrelevant differences. Most multiple testing procedures unrealistically assume that all positive results are false positives. However, in applied settings, results typically represent a mix of true and false positives. This implies that multiplicity corrections do not effectively differentiate between true and false positives. Instead, considering the distributions of p-values and the proportion of significant results can help to identify bodies of evidence unlikely to be driven by false-positive results. In conclusion, rather than attempting to categorize results as true or false, medical research should embrace established statistical methods that focus on estimation accuracy, replication, and consistency.
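
Two of the abstract's suggestions can be sketched in code. The first function checks whether a confidence interval lies entirely inside a clinically acceptable margin. The second computes how likely it is to see at least `k` significant results among `m` tests if every null hypothesis were true. This is an illustration under stated assumptions, not the authors' procedure: the estimate is assumed to be approximately normal, the margin is assumed symmetric around zero, the tests are assumed independent, and all function names are hypothetical.

```python
from math import comb
from statistics import NormalDist


def confidence_interval(estimate: float, std_error: float, level: float = 0.95):
    """Normal-approximation confidence interval (assumed, not from the paper)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def within_margin(estimate: float, std_error: float, margin: float,
                  level: float = 0.95) -> bool:
    """True if the whole interval lies inside (-margin, +margin).

    `margin` is a hypothetical clinically acceptable difference that
    must be chosen on subject-matter grounds before seeing the data.
    """
    lower, upper = confidence_interval(estimate, std_error, level)
    return -margin < lower and upper < margin


def prob_at_least_k_significant(k: int, m: int, alpha: float = 0.05) -> float:
    """P(X >= k) for X ~ Binomial(m, alpha).

    Assumes m independent tests with every null hypothesis true, so
    each test is significant with probability alpha.
    """
    return sum(comb(m, i) * alpha**i * (1.0 - alpha) ** (m - i)
               for i in range(k, m + 1))


if __name__ == "__main__":
    # Both estimates are "non-significant" (the interval includes zero),
    # but only the precise one supports a clinically small association.
    print(within_margin(0.01, std_error=0.02, margin=0.10))
    print(within_margin(0.01, std_error=0.10, margin=0.10))
    # Chance of 8 or more significant results out of 20 under a global null.
    print(prob_at_least_k_significant(8, 20))
```

The two example estimates share the same point value and both fail to reject the null, yet only the one with the small standard error is informative about absence of a clinically relevant association. A very small tail probability from the second function suggests that a body of significant results is unlikely to consist only of false positives, provided the independence assumption is reasonable.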

## Full-text entities

- **Genes:** PCSK9 (proprotein convertase subtilisin/kexin type 9) [NCBI Gene 255738] {aka FH3, FHCL3, HCHOLA3, LDLCQ1, NARC-1, NARC1}
- **Diseases:** stroke (MESH:D020521), myocardial infarction (MESH:D009203), PAD (MESH:D058729), acute limb ischemia (MESH:D000208), venous thromboembolism (MESH:D054556), ischemic (MESH:D002545), bleeding (MESH:D006470), cardiovascular disease (MESH:D002318), ischaemic stroke (MESH:D002544)
- **Chemicals:** evolocumab (MESH:C577155), rivaroxaban (MESH:D000069552)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11928781/full.md

---
Source: https://tomesphere.com/paper/PMC11928781