Development of a defacing algorithm to protect the privacy of head and neck cancer patients in publicly-accessible radiotherapy datasets
Kayla O'Sullivan-Steben, Luc Galarneau, John Kildea

TL;DR
This paper presents a novel automated defacing algorithm for head and neck cancer CT scans that effectively anonymizes patient data while preserving critical anatomical structures, facilitating data sharing for AI research.
Contribution
The authors developed and validated a new defacing method that protects patient privacy without compromising the utility of medical imaging data for analysis.
Findings
Face recognition accuracy dropped from 97% to 4% after defacing.
Autocontouring achieved perfect Dice scores for organs below the defaced region.
Most PTVs were unaffected by the defacing process.
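For readers unfamiliar with the metric in the second finding, the Dice score between two binary masks A and B is 2|A∩B| / (|A| + |B|), so a value of 1 means the two contours are identical. The sketch below is illustrative only: the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions, not details taken from the paper.

```python
import numpy as np

def dice_score(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    overlap = np.logical_and(a, b).sum()
    return 2.0 * overlap / total
```

Under this definition, comparing an organ contour generated on the original CT with the one generated on the defaced CT gives 1.0 only when the two masks match voxel for voxel.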
Abstract
Introduction: The rise in public medical imaging datasets has raised concerns about patient reidentification from head CT scans. However, existing defacing algorithms often remove or distort Organs at Risk (OARs) and Planning Target Volumes (PTVs) in head and neck cancer (HNC) patients, and ignore DICOM-RT Structure Set and Dose data. Therefore, we developed and validated a novel automated defacing algorithm that preserves these critical structures while removing identifiable features from HNC CTs and DICOM-RT data. Methods: Eye contours were used as landmarks to automate the removal of CT pixels above the inferior-most eye slice and anterior to the eye midpoint. Pixels within PTVs were retained if they intersected with the removed region. The body contour and dose map were reshaped to reflect the defaced image. We validated our approach on 829 HNC CTs from 622 patients. Privacy…
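The masking step described in the Methods can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `deface_ct`, the `(z, y, x)` array layout with z increasing toward superior and y increasing toward posterior, the use of the midpoint of the eyes' anterior-posterior extent as the "eye midpoint", the inclusion of the inferior-most eye slice itself, and the air fill value of -1000 HU are all hypothetical choices. Reshaping the body contour and dose map is not shown.

```python
import numpy as np

AIR_HU = -1000  # assumed fill value for removed voxels

def deface_ct(ct, eye_mask, ptv_mask=None, fill_value=AIR_HU):
    """Blank voxels superior to the lowest eye slice and anterior to the eye midpoint.

    Arrays are assumed to be shaped (z, y, x), with z increasing toward
    superior and y increasing toward posterior (so anterior = smaller y).
    """
    eye_mask = np.asarray(eye_mask, dtype=bool)
    if not eye_mask.any():
        raise ValueError("eye mask is empty; no landmark available")

    z_idx, y_idx, _ = np.nonzero(eye_mask)
    z_inferior = z_idx.min()                   # inferior-most slice containing eye
    y_mid = (y_idx.min() + y_idx.max()) // 2   # assumed definition of the eye midpoint

    # Region to remove: from the lowest eye slice upward, anterior to the midpoint.
    removed = np.zeros(ct.shape, dtype=bool)
    removed[z_inferior:, :y_mid, :] = True

    # Retain any PTV voxels that intersect the removed region.
    if ptv_mask is not None:
        removed &= ~np.asarray(ptv_mask, dtype=bool)

    defaced = ct.copy()
    defaced[removed] = fill_value
    return defaced, removed
```

The function returns the `removed` mask alongside the defaced volume so that the same region could later be applied to other grids aligned with the CT; how the paper actually updates the body contour and dose map is not specified in the text above.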
Taxonomy
Topics: Privacy-Preserving Technologies in Data
