# High‐Dimensional Propensity Scores for Mitigating Confounding: Implementation Using Primary and Secondary Care Data in Hong Kong

**Authors:** Edmund C. L. Cheung, Min Fan, Celine S. L. Chui, Angel Y. S. Wong, John Tazare

PMC · DOI: 10.1002/pds.70326 · Pharmacoepidemiology and Drug Safety · 2026-01-25

## TL;DR

This study shows how the high-dimensional propensity score (HDPS) algorithm can reduce confounding bias in healthcare database studies, applying it to Hong Kong data to compare antihypertensive drug classes with respect to dementia risk.

## Contribution

The study implements HDPS in Hong Kong healthcare data and develops an R package for its application, improving covariate balance and identifying influential confounders.

## Key findings

- HDPS improved covariate balance and identified potential frailty markers as influential confounders.
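- After HDPS, the estimate for beta-blockers versus ACE inhibitors shifted from no evidence of an association (HR 0.93, 95% CI 0.86–1.02) to moderate evidence of a reduced hazard of incident dementia (HR 0.90, 95% CI 0.82–0.98).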
- An R package was developed to facilitate HDPS implementation in other healthcare databases.

## Abstract

Confounding is a key concern in observational studies using healthcare databases. The high‐dimensional propensity score (HDPS) algorithm is an approach for generating and prioritising proxy variables, leveraging all available information in a database to mitigate residual confounding. This study aims to implement HDPS approaches in a novel setting using primary and secondary care data from Hong Kong (HK).
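To make the idea of prioritising proxy variables concrete, the following is a minimal Python sketch of one ranking rule widely used in the HDPS literature, the Bross bias multiplier. This summary does not state which prioritisation rule the study or its R package uses, so the choice of formula, the function names (`bross_bias_multiplier`, `rank_covariates`), and the example codes and numbers are illustrative assumptions, not the authors' implementation.

```python
import math


def bross_bias_multiplier(p_c1, p_c0, rr_cd):
    """Bross bias multiplier for one binary proxy covariate.

    p_c1  : prevalence of the covariate in the exposed group
    p_c0  : prevalence of the covariate in the comparator group
    rr_cd : relative risk linking the covariate to the outcome (> 0)
    """
    return (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)


def rank_covariates(covariates, k):
    """Return the names of the k covariates with the largest |log(multiplier)|.

    `covariates` is a list of (name, p_c1, p_c0, rr_cd) tuples.
    """
    scored = [
        (abs(math.log(bross_bias_multiplier(p1, p0, rr))), name)
        for name, p1, p0, rr in covariates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical proxy covariates; the names and values are made up.
example = [
    ("code_A", 0.30, 0.10, 2.0),  # more common among the exposed
    ("code_B", 0.20, 0.20, 3.0),  # balanced, so no confounding potential
    ("code_C", 0.05, 0.25, 2.0),  # more common among comparators
]

top_two = rank_covariates(example, k=2)
print(top_two)
```

A covariate that is equally prevalent in both groups gets a multiplier of 1 and drops to the bottom of the ranking, however strongly it predicts the outcome. In an analysis like the one described here, `k` would be the number of HDPS covariates retained.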

Using data from HK, we implemented HDPS in a cohort study investigating the use of different antihypertensive drug classes and incident dementia risk. The top 250 HDPS covariates were included in inverse probability of treatment weighting in addition to investigator‐specified variables. Diagnostics evaluated the performance of the HDPS. Sensitivity analyses included varying the number of HDPS covariates and removing potentially influential or inappropriate covariates.
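The weighting and balance diagnostic described above can be sketched in a few lines of Python. This is a sketch under stated assumptions rather than the study's code: it uses unstabilised inverse probability of treatment weights and a weighted standardised mean difference, it takes the propensity scores as given instead of fitting a model, and the function names and toy data are hypothetical. The summary does not specify the weight type or the diagnostics the authors used.

```python
import math


def iptw_weights(treated, propensity):
    """Unstabilised weights: 1/e for treated, 1/(1 - e) for comparators."""
    return [1 / e if t == 1 else 1 / (1 - e) for t, e in zip(treated, propensity)]


def _weighted_mean_var(values, weights):
    """Weighted mean and weighted (population-style) variance."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardized_mean_difference(x, treated, weights=None):
    """(Weighted) standardised mean difference of covariate x between groups."""
    if weights is None:
        weights = [1.0] * len(x)
    x1 = [v for v, t in zip(x, treated) if t == 1]
    w1 = [w for w, t in zip(weights, treated) if t == 1]
    x0 = [v for v, t in zip(x, treated) if t == 0]
    w0 = [w for w, t in zip(weights, treated) if t == 0]
    m1, v1 = _weighted_mean_var(x1, w1)
    m0, v0 = _weighted_mean_var(x0, w0)
    return (m1 - m0) / math.sqrt((v1 + v0) / 2)


# Toy data (made up): a binary covariate that strongly drives treatment.
x = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
treated = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
# Assumed propensity scores: 0.8 when x == 1, 0.2 when x == 0.
propensity = [0.8 if v == 1 else 0.2 for v in x]

smd_before = standardized_mean_difference(x, treated)
smd_after = standardized_mean_difference(x, treated, iptw_weights(treated, propensity))
print(round(smd_before, 3), round(smd_after, 3))
```

In this toy example the covariate is badly imbalanced before weighting and balanced after it, because the assumed propensity scores match the treatment mechanism exactly. Real data will not balance this cleanly, which is why such diagnostics are usually checked across many covariates.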

A total of 434 506 new users of antihypertensives were included. With a traditional propensity score (PS) approach, no evidence of an association was observed for any antihypertensive comparison. After HDPS implementation, the estimate for beta‐blockers shifted from no evidence (hazard ratio (HR): 0.93, 95% confidence interval (CI): 0.86–1.02) to moderate evidence of a reduced hazard of incident dementia compared to angiotensin‐converting enzyme inhibitors (HR: 0.90, 95% CI: 0.82–0.98). Greater overall covariate balance between comparison groups was achieved after the inclusion of HDPS covariates, and potential frailty markers were identified as influential.

We successfully implemented the HDPS in HK data, observing improved covariate balance across a wider set of potential confounders. HDPS also identified possible database‐specific frailty markers which could be considered more widely when specifying adjustment variables in this setting.

The aim of the study was to implement the high‐dimensional propensity score (HDPS) algorithm, a method for mitigating bias due to confounding in observational studies, in a novel setting using primary and secondary data from Hong Kong.

We compared the results of the case study before and after application of HDPS, evaluated the performance using diagnostics, and developed an R package for HDPS implementation in any database.

HDPS analyses are encouraged in future observational studies using electronic healthcare databases to mitigate confounding and identify potentially influential covariates that may be neglected when using clinical or epidemiological knowledge solely for variable selection.

Confounding is a key concern in observational studies using healthcare databases. The HDPS is a method that can leverage all available information in a database to mitigate residual confounding. This study aims to implement HDPS approaches in a novel setting using primary and secondary data available from Hong Kong (HK). Using data from HK, we applied HDPS in a study comparing the use of different antihypertensive drug classes and dementia risk. We also used diagnostics to evaluate the performance of the HDPS. In the results with a traditional propensity score approach, no evidence for an association was observed for each antihypertensive comparison. After HDPS implementation, the estimate for beta‐blockers shifted from no evidence to moderate evidence of a reduced risk of dementia compared to angiotensin‐converting enzyme inhibitors. A greater balance of variables between comparison groups was achieved with HDPS, indicating improved performance compared with the traditional propensity score approach, and additional markers with potential confounding were identified as influential. We also developed a statistical package to ease its implementation in future studies. In conclusion, we successfully implemented the HDPS in HK data and encourage its uptake in observational studies using healthcare data.

## Linked entities

- **Diseases:** dementia (MONDO:0001627)

## Full-text entities

- **Genes:** ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}, ACE (angiotensin I converting enzyme) [NCBI Gene 1636] {aka ACE1, CD143, DCP, DCP1}
- **Diseases:** hypertension (MESH:D006973), frailty (MESH:D000073496), stroke (MESH:D020521), malignant neoplasm of the pancreas (MESH:D010190), HDPS (MESH:C563324), intracranial haemorrhage (MESH:D013345), skull fractures (MESH:D012887), type 2 diabetes (MESH:D003924), dementia (MESH:D003704), death (MESH:D003643), HA (MESH:D003428), protein-calorie malnutrition (MESH:D011502), injuries to the nervous system (MESH:D020196), diabetes (MESH:D003920), fracture of the base of skull (MESH:D019292), occlusion of cerebral arteries (MESH:D001157), myocardial infarction (MESH:D009203)
- **Chemicals:** sugar (MESH:D000073893), sulphonylureas (MESH:D013453), Calcium (MESH:D002118), ACEIs (-), cholesterol (MESH:D002784)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Mutations:** D24H

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12833473/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12833473/full.md

---
Source: https://tomesphere.com/paper/PMC12833473