Identifying the potential of sample overlap in evidence synthesis of observational studies

Zhentian Zhang; Tim Friede; Tim Mathes

arXiv:2602.21410·stat.ME·February 26, 2026

Open Access

TL;DR

This paper presents a set-theoretic method to identify and quantify sample overlap in observational studies, improving the reliability of evidence synthesis without needing individual participant data.

Contribution

The authors introduce a novel, practical set-based approach to detect sample overlap in evidence synthesis, addressing a key challenge in observational research integration.

Findings

01. Sample overlap is identified effectively on real-world data.

02. The method yields the largest overlap-free sample set for evidence synthesis.

03. The work highlights the importance of addressing sample overlap in secondary data use.

Abstract

Sample overlap is a common issue in evidence synthesis in medical research, particularly when integrating findings from observational studies that use existing databases such as registries. Because unique identifiers for each observation are generally inaccessible, addressing sample overlap has been a complex problem, potentially biasing evidence synthesis outcomes and undermining their credibility. We developed a method to construct indicators for the degree of sample overlap in evidence synthesis of studies based on existing data. Our method, rooted in set theory and based on coding the ranges of several well-selected sample characteristics, offers a practical solution by drawing inferences from sample characteristics rather than from individual participant data. Useful information, such as the overlap-free sample set with the largest sample…
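The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes, not the authors' implementation. It assumes each study is coded by closed ranges of a few characteristics (the names `years` and `age`, the study labels, and all numbers are hypothetical), treats two studies as potentially overlapping only when every coded range intersects, and finds the overlap-free subset with the largest total sample size by exhaustive search.

```python
from itertools import combinations

# Hypothetical studies: each characteristic is coded as a closed (low, high) range.
# All values are made up for illustration; none come from the paper.
studies = {
    "A": {"n": 1200, "years": (2005, 2010), "age": (18, 65)},
    "B": {"n": 800, "years": (2009, 2014), "age": (40, 80)},
    "C": {"n": 500, "years": (2011, 2015), "age": (18, 39)},
}
CHARACTERISTICS = ("years", "age")  # assumed set of coded characteristics


def ranges_intersect(r1, r2):
    """Two closed ranges share at least one value."""
    return max(r1[0], r2[0]) <= min(r1[1], r2[1])


def may_overlap(s1, s2):
    """Overlap is possible only if the ranges intersect on every characteristic.

    If the ranges are disjoint on any one characteristic, no participant
    can belong to both samples.
    """
    return all(ranges_intersect(s1[c], s2[c]) for c in CHARACTERISTICS)


def largest_overlap_free_set(studies):
    """Exhaustively search for the pairwise overlap-free subset with maximum total n.

    The search is exponential in the number of studies, so it is only
    suitable for small collections.
    """
    names = sorted(studies)
    best, best_n = (), 0
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if any(
                may_overlap(studies[a], studies[b])
                for a, b in combinations(subset, 2)
            ):
                continue
            total = sum(studies[s]["n"] for s in subset)
            if total > best_n:
                best, best_n = subset, total
    return set(best), best_n


if __name__ == "__main__":
    print(largest_overlap_free_set(studies))
```

In this toy data, A and B intersect on both characteristics and so may share participants, while C is disjoint from A in recruitment years and from B in age range. Intersecting ranges indicate only the potential for overlap, not its presence, which matches the paper's framing of the result as an indicator.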

Taxonomy

Topics: Meta-analysis and systematic reviews · Health Policy Implementation Science · Biomedical Text Mining and Ontologies