Matching Known Patients to Health Records in Washington State Data
Latanya Sweeney

TL;DR
This paper demonstrates that publicly available, patient-level health data can be combined with newspaper stories to put names to individual patient records, exposing a concrete re-identification risk.
Contribution
It introduces a method for matching de-identified health records (no names or addresses, only ZIPs) with news reports, revealing privacy vulnerabilities in publicly available health data.
Findings
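News information uniquely and exactly matched State medical records for 35 of the 81 cases (43 percent) found in 2011; a news reporter verified matches by contacting patients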
News stories can uniquely identify patients in health datasets
Potential privacy risks for individuals in publicly available health data
Abstract
The State of Washington sells patient-level health data for $50. This publicly available dataset has virtually all hospitalizations occurring in the State in a given year, including patient demographics, diagnoses, procedures, attending physician, hospital, a summary of charges, and how the bill was paid. It does not contain patient names or addresses (only ZIPs). Newspaper stories printed in the State for the same year that contain the word "hospitalized" often include a patient's name and residential information and explain why the person was hospitalized, such as vehicle accident or assault. News information uniquely and exactly matched medical records in the State database for 35 of the 81 cases (or 43 percent) found in 2011, thereby putting names to patient records. A news reporter verified matches by contacting patients. Employers, financial organizations and others know the same…
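The matching step described in the abstract can be illustrated with a minimal sketch. This is not the author's code: the field names (`zip`, `age`, `sex`, `admit_month`, `cause`), the helper `find_unique_match`, and all records below are hypothetical and invented for illustration. It assumes a news story counts as a match only when exactly one record agrees with every detail the story reports.

```python
from typing import Dict, List, Optional

Record = Dict[str, object]


def find_unique_match(news_fields: Record, records: List[Record]) -> Optional[Record]:
    """Return the single record that agrees with every news field, else None."""
    candidates = [
        rec for rec in records
        if all(rec.get(field) == value for field, value in news_fields.items())
    ]
    # Zero candidates (no match) and several candidates (ambiguous) both fail.
    return candidates[0] if len(candidates) == 1 else None


# Invented toy records standing in for the de-identified hospital data.
records = [
    {"id": 1, "zip": "98101", "age": 60, "sex": "M", "admit_month": "2011-10", "cause": "motorcycle crash"},
    {"id": 2, "zip": "98101", "age": 60, "sex": "M", "admit_month": "2011-10", "cause": "fall"},
    {"id": 3, "zip": "99201", "age": 34, "sex": "F", "admit_month": "2011-03", "cause": "assault"},
    {"id": 4, "zip": "99201", "age": 34, "sex": "F", "admit_month": "2011-03", "cause": "assault"},
]

# Invented news items: a name plus the details a story might report.
news_items = [
    {"name": "Person A", "fields": {"zip": "98101", "age": 60, "sex": "M",
                                    "admit_month": "2011-10", "cause": "motorcycle crash"}},
    {"name": "Person B", "fields": {"zip": "99201", "age": 34, "sex": "F",
                                    "admit_month": "2011-03", "cause": "assault"}},
    {"name": "Person C", "fields": {"zip": "98052", "age": 71, "sex": "F",
                                    "admit_month": "2011-06", "cause": "fall"}},
]

matches = {
    item["name"]: find_unique_match(item["fields"], records)
    for item in news_items
}

# The match rate reported in the abstract: 35 of 81 cases.
reported_rate_percent = round(100 * 35 / 81)
```

In this toy data, Person A resolves to a single record, Person B is ambiguous because two records share every reported detail, and Person C has no candidate at all. Only the first case would count as a unique and exact match in the sense the abstract uses.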
