# Confidentiality and linked data

**Authors:** Felix Ritchie, Jim Smith

arXiv: 1907.06465 · 2020-08-10

## TL;DR

This paper introduces the principles and methods of linking data across different sources and points in time, examines the confidentiality risks that linking creates, and briefly reviews statistical and non-statistical solutions for micro-data release.

## Contribution

It introduces the principles and methods of data linking, analyzes confidentiality risk from both data producer outputs and released micro-data with a focus on the "intruder" problem, and briefly reviews potential solutions for micro-data release.

## Key findings

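- Sharing and linking identified administrative datasets offers large potential benefits but adds significant confidentiality risk
- The "intruder" problem is central to confidentiality risk, for both data producer outputs and micro-data released for further analysis
- Solutions to micro-data release include both statistical and non-statistical approaches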

## Abstract

Data providers such as government statistical agencies perform a balancing act: maximising information published to inform decision-making and research, while simultaneously protecting privacy. The emergence of identified administrative datasets with the potential for sharing (and thus linking) offers huge potential benefits but significant additional risks. This article introduces the principles and methods of linking data across different sources and points in time, focusing on potential areas of risk. We then consider confidentiality risk, focusing in particular on the "intruder" problem central to the area, and looking at both risks from data producer outputs and from the release of micro-data for further analysis. Finally, we briefly consider potential solutions to micro-data release, both the statistical solutions considered in other contributed articles and non-statistical solutions.
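
To make the "intruder" problem concrete, the following is a minimal sketch of one common way such an attack is described: an intruder links released, de-identified micro-data to an external identified dataset on shared quasi-identifiers and keeps only the combinations that are unique in both. It is an illustration under stated assumptions, not the authors' method. The field names (`age_band`, `postcode_area`, `sex`, `income`, `name`), the function `link_unique_matches`, and all records are hypothetical.

```python
from collections import Counter

# Hypothetical quasi-identifiers assumed to appear in both datasets.
QUASI_IDENTIFIERS = ("age_band", "postcode_area", "sex")


def key(record):
    """Build the linkage key from a record's quasi-identifiers."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def link_unique_matches(released, external):
    """Return (name, income) pairs for records whose key is unique in both datasets."""
    released_counts = Counter(key(r) for r in released)
    external_counts = Counter(key(e) for e in external)
    external_by_key = {key(e): e for e in external}

    matches = []
    for r in released:
        k = key(r)
        # A one-to-one match on the key is treated as a claimed re-identification.
        if released_counts[k] == 1 and external_counts[k] == 1:
            matches.append((external_by_key[k]["name"], r["income"]))
    return matches


# Hypothetical de-identified micro-data release (no names).
released = [
    {"age_band": "30-39", "postcode_area": "BS1", "sex": "F", "income": 41000},
    {"age_band": "30-39", "postcode_area": "BS1", "sex": "M", "income": 38000},
    {"age_band": "30-39", "postcode_area": "BS1", "sex": "M", "income": 52000},
    {"age_band": "60-69", "postcode_area": "BS8", "sex": "F", "income": 27000},
]

# Hypothetical identified dataset available to the intruder.
external = [
    {"name": "Person A", "age_band": "30-39", "postcode_area": "BS1", "sex": "F"},
    {"name": "Person B", "age_band": "30-39", "postcode_area": "BS1", "sex": "M"},
    {"name": "Person C", "age_band": "60-69", "postcode_area": "BS8", "sex": "F"},
    {"name": "Person D", "age_band": "60-69", "postcode_area": "BS8", "sex": "F"},
]

matches = link_unique_matches(released, external)
print(matches)
```

In this toy data only one record links one-to-one; the others are shielded because their quasi-identifier combination is shared by more than one record in at least one of the two datasets. Coarsening the quasi-identifiers so that fewer combinations are unique is one example of the kind of statistical protection the abstract alludes to.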

---
Source: https://tomesphere.com/paper/1907.06465