When Algorithms Infer Gender: Revisiting Computational Phenotyping with Electronic Health Records Data
Jessica Gronsbell, Hilary Thurston, Lillian Dong, Vanessa Ferguson, Diksha Sen Chaudhury, Braden O'Neill, Katrina S. Sha, Rebecca Bonneville

TL;DR
This paper critically reviews algorithms that infer gender from electronic health records (EHRs), examines the methodological and ethical challenges they raise, and proposes priorities for future research.
Contribution
The paper reviews current practices for computational phenotyping of gender in EHRs, examines the methodological and ethical concerns these practices raise, highlights existing recommendations for biomedical researchers, and proposes priorities for future work.
Findings
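Current algorithms infer gender from data already in the health record, such as diagnosis codes, medication histories, and clinical notes.
Computational phenotyping of gender raises significant methodological and ethical concerns, particularly around the potential misuse of algorithm outputs.
The paper highlights existing recommendations for biomedical researchers and proposes priorities for future work.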
Abstract
Computational phenotyping has emerged as a practical solution to the incomplete collection of data on gender in electronic health records (EHRs). This approach relies on algorithms to infer a patient's gender using the available data in their health record, such as diagnosis codes, medication histories, and information in clinical notes. Although intended to improve the visibility of trans and gender-expansive populations in EHR-based biomedical research, computational phenotyping raises significant methodological and ethical concerns related to the potential misuse of algorithm outputs. In this paper, we review current practices for computational phenotyping of gender and examine its challenges through a critical lens. We also highlight existing recommendations for biomedical researchers and propose priorities for future work in this domain.
