# Development and quality appraisal of a new English breast screening linked data set as part of the age, test threshold, and frequency of mammography screening (ATHENA-M) study

**Authors:** Julia Brettschneider, Breanna Morrison, David Jenkinson, Karoline Freeman, Jackie Walton, Alice Sitch, Sue Hudson, Olive Kearins, Alice Mansbridge, Sarah E Pinder, Rosalind Given-Wilson, Louise Wilkinson, Matthew G Wallis, Shan Cheung, Sian Taylor-Phillips

PMC · DOI: 10.1093/bjr/tqad023 · 2023-12-12

## TL;DR

Researchers created a comprehensive dataset tracking breast cancer screening and outcomes in England from 1997 to 2018, which can help improve screening practices.

## Contribution

The ATHENA-M project developed the most complete dataset of English breast screening records and outcomes to date.

## Key findings

- A dataset was created linking screening records to cancer and mortality data for over 11 million women.
- Data quality was high from 1997 onward, with over 99% completeness for core screening variables.
- The dataset includes 139 million person-years of follow-up and over 73,000 breast cancer deaths.

## Abstract

The objective was to build a data set capturing the whole breast cancer screening journey, from individual screening records to outcomes, and to assess its data quality.

Routine screening records (invitation, attendance, test results) from all 79 English NHS breast screening centres between January 1, 1988 and March 31, 2018 were linked to cancer registry (cancer characteristics and treatment) and national mortality data. Data quality was assessed using comparability, validity, timeliness, and completeness.
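Completeness, one of the four quality dimensions named above, is commonly summarised as the share of records holding a non-missing value for each variable. The following is a minimal sketch of that idea in Python, not the study's actual procedure. The function name, the field names, and the toy records are all hypothetical and do not come from the ATHENA-M data set.

```python
from typing import Dict, Optional, Sequence

# Hypothetical record type: one dictionary per screening episode.
Record = Dict[str, Optional[str]]


def completeness(records: Sequence[Record], fields: Sequence[str]) -> Dict[str, float]:
    """Return, for each field, the fraction of records with a non-missing value.

    A value counts as missing if the key is absent, the value is None,
    or the value is an empty string (an assumption made for this sketch).
    """
    if not records:
        raise ValueError("no records to assess")
    result = {}
    for field in fields:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = present / len(records)
    return result


# Toy data with invented field names, for illustration only.
toy_records = [
    {"invited": "yes", "attended": "yes", "test_result": "normal"},
    {"invited": "yes", "attended": "no", "test_result": None},
    {"invited": "yes", "attended": "yes", "test_result": "recall"},
    {"invited": "yes", "attended": "", "test_result": None},
]

scores = completeness(toy_records, ["invited", "attended", "test_result"])
for name, share in scores.items():
    print(f"{name}: {share:.0%} complete")
```

A real assessment would also need to separate structural gaps (for example, no test result for a woman who did not attend) from genuinely missing values; this sketch treats both the same way.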

Screening records were extracted from 76 of the 79 English breast screening centres; extraction from the remaining 3 was not possible due to software issues. Data linkage was successful from 1997, after the introduction of a universal identifier for women (the NHS number). Prior to 1997, outcome data are incomplete due to linkage issues, reducing validity. Between January 1, 1997 and March 31, 2018, a total of 11 262 730 women were offered screening, of whom 9 371 973 attended at least one appointment. This gave 139 million person-years of follow-up (a median of 12.4 person-years for each woman included), with 73 810 breast cancer deaths and 1 111 139 any-cause deaths. Comparability to reference data sets and internal validity were demonstrated. Data completeness was high for core screening variables (>99%) and main cancer outcomes (>95%).
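The counts reported above allow a few derived proportions to be worked out. The short Python sketch below does the arithmetic. The input figures are taken from the abstract, but the derived values are not reported there, and the sketch assumes that the women offered screening are the denominator for follow-up and that the rounded figure of 139 million person-years is close enough for an approximate mean.

```python
# Figures reported in the abstract (1 January 1997 to 31 March 2018).
offered = 11_262_730          # women offered screening
attended = 9_371_973          # women who attended at least one appointment
person_years = 139_000_000    # follow-up, rounded to the nearest million in the source
breast_cancer_deaths = 73_810
all_cause_deaths = 1_111_139

# Derived quantities (not stated in the source; computed here for illustration).
attended_share = attended / offered
mean_follow_up_years = person_years / offered  # assumes offered women are the denominator
breast_cancer_death_share = breast_cancer_deaths / all_cause_deaths

print(f"Attended at least once: {attended_share:.1%}")
print(f"Approximate mean follow-up: {mean_follow_up_years:.1f} person-years per woman")
print(f"Breast cancer share of all deaths: {breast_cancer_death_share:.1%}")
```

Under these assumptions, roughly 83% of women offered screening attended at least once, and the approximate mean follow-up of about 12.3 person-years per woman sits close to the reported median of 12.4.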

The ATHENA-M project has created a large, high-quality, and representative data set of individual women’s screening trajectories and outcomes in England from 1997 to 2018; data before 1997 are of lower quality.

This is the most complete data set of English breast screening records and outcomes constructed to date, which can be used to evaluate and optimize screening.

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** deaths (MESH:D003643), breast cancer (MESH:D001943), cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11027252/full.md

---
Source: https://tomesphere.com/paper/PMC11027252