Grandma Karl is 27 years old -- research agenda for pseudonymization of research data
Elena Volodina (University of Gothenburg), Simon Dobnik (University of Gothenburg), Therese Lindström Tiedemann (University of Helsinki), Xuan-Son Vu (Umeå University)

TL;DR
This paper proposes a research agenda to explore pseudonymization techniques for unstructured research data, focusing on balancing data accessibility with privacy protection under GDPR.
Contribution
It outlines a comprehensive research agenda for pseudonymization, emphasizing the need for studies on its effects and development of context-sensitive algorithms.
Findings
Identifies key challenges in pseudonymization of unstructured data.
Proposes research directions for improving pseudonymization techniques.
Highlights the importance of balancing data utility and privacy.
Abstract
Accessibility of research data is critical for advances in many research fields, but textual data often cannot be shared due to the personal and sensitive information which it contains, e.g. names or political opinions. The General Data Protection Regulation (GDPR) suggests pseudonymization as a solution to secure open access to research data, but we need to learn more about pseudonymization as an approach before adopting it for manipulation of research data. This paper outlines a research agenda within pseudonymization, namely the need for studies into the effects of pseudonymization on unstructured data in relation to e.g. readability and language assessment, as well as the effectiveness of pseudonymization as a way of protecting writer identity, while also exploring different ways of developing context-sensitive algorithms for detection, labelling and replacement of personal information in…
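The three steps the abstract names (detection, labelling, replacement) can be illustrated with a minimal sketch. This is not the authors' method: the gazetteer `NAME_LABELS`, the pools in `PSEUDONYMS`, and the function names are all hypothetical, and a real system would need a trained detector rather than a word list. The sketch only shows why the label matters: replacing a name with one from the same category, consistently across mentions, avoids incoherent output of the kind the title alludes to.

```python
import re

# Hypothetical toy gazetteer (name -> label); an assumption, not from the paper.
NAME_LABELS = {"Anna": "female", "Maria": "female", "Erik": "male"}

# Hypothetical replacement pools keyed by the same labels.
PSEUDONYMS = {"female": ["Eva", "Sara"], "male": ["Karl", "Olof"]}

NAME_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, NAME_LABELS)) + r")\b"
)


def detect(text):
    """Detection + labelling: (start, end, surface form, label) per match."""
    return [
        (m.start(), m.end(), m.group(), NAME_LABELS[m.group()])
        for m in NAME_PATTERN.finditer(text)
    ]


def pseudonymize(text):
    """Replacement: swap each detected name for a pseudonym with the same
    label, reusing the same pseudonym for repeated mentions."""
    mapping = {}
    used = {label: 0 for label in PSEUDONYMS}
    pieces, cursor = [], 0
    for start, end, surface, label in detect(text):
        if surface not in mapping:
            pool = PSEUDONYMS[label]
            # Wraps around if the pool is exhausted, so distinct names may
            # then share a pseudonym; acceptable only for a toy example.
            mapping[surface] = pool[used[label] % len(pool)]
            used[label] += 1
        pieces.append(text[cursor:start])
        pieces.append(mapping[surface])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), mapping


if __name__ == "__main__":
    result, mapping = pseudonymize("Grandma Anna is 72. Anna lives with Erik.")
    print(result)
    print(mapping)
```

A label-blind replacement over the same input could turn "Grandma Anna" into "Grandma Karl"; keeping the label in the loop is one simple form of the context sensitivity the agenda calls for.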
Taxonomy
Topics: Natural Language Processing Techniques
