Methodological Precedence in Health Tech: Why ML/Big Data Analysis Must Follow Basic Epidemiological Consistency. A Case Study
Marco Roccetti

TL;DR
This paper argues that advanced ML and Big Data health analyses must first satisfy fundamental epidemiological principles: neglecting basic methodological rigor can lead to misleading conclusions, as a case study on COVID-19 vaccine adverse events demonstrates.
Contribution
It highlights the necessity of verifying basic epidemiological consistency before applying complex analytical methods in health research, illustrated by a case study exposing methodological flaws.
Findings
Identification of paradoxes in COVID-19 vaccine data due to methodological flaws
Demonstration that advanced analyses can amplify initial design errors
Validation that basic epidemiological checks are essential before complex modeling
Abstract
The integration of advanced analytical tools, including Machine Learning (ML) and massive data processing, has revolutionized health research, promising unprecedented accuracy in diagnosis and risk prediction. However, the rigor of these complex methods depends fundamentally on the quality and integrity of the underlying datasets and the validity of their statistical design. We present an emblematic case in which advanced analysis (ML/Big Data) must follow the verification of basic methodological coherence and adherence to established medical protocols, such as the STROBE Statement. This study highlights a crucial cautionary principle: sophisticated analyses amplify, rather than correct, severe methodological flaws rooted in basic design choices, leading to misleading or contradictory findings. By applying simple, standard descriptive statistical methods and…
Taxonomy
Topics: Advanced Causal Inference Techniques · Immune responses and vaccinations · COVID-19 epidemiological studies
