Why patient data cannot be easily forgotten?
Ruolin Su, Xiao Liu, Sotirios A. Tsaftaris

TL;DR
This paper studies how to make AI models forget a patient's data, shows that this cannot be done easily, and proposes a targeted, patient-wise forgetting method evaluated on the Automated Cardiac Diagnosis Challenge dataset.
Contribution
It introduces a novel targeted forgetting approach for patient data in AI models, addressing the under-explored problem of data removal in medical AI.
Findings
Patient data cannot be easily forgotten; the paper hypothesizes that a patient's data are either common (similar to other patients) or edge cases (unique and rare).
The proposed targeted forgetting method outperforms existing state-of-the-art techniques.
Experiments on cardiac diagnosis data demonstrate improved forgetting performance.
Abstract
Rights provisioned within data protection regulations permit patients to request that knowledge about their information be eliminated by data holders. With the advent of AI learned on data, one can imagine that such rights can extend to requests for forgetting knowledge of patients' data within AI models. However, forgetting patients' imaging data from AI models is still an under-explored problem. In this paper, we study the influence of patient data on model performance and formulate two hypotheses for a patient's data: either they are common and similar to other patients, or they form edge cases, i.e., unique and rare cases. We show that it is not possible to easily forget patient data. We propose a targeted forgetting approach to perform patient-wise forgetting. Extensive experiments on the benchmark Automated Cardiac Diagnosis Challenge dataset showcase the improved performance of the…
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Privacy-Preserving Technologies in Data
