A multi-language toolkit for the semi-automated checking of research outputs

Richard J. Preen; Maha Albashir; Simon Davy; Jim Smith

arXiv:2212.02935 · cs.CR · May 5, 2025

TL;DR

This paper introduces SACRO, an open-source toolkit that semi-automates privacy disclosure checks in research outputs, enhancing collaboration between researchers and data custodians in secure environments.

Contribution

It presents a multi-language, best-practice framework that supports human-in-the-loop privacy checks with automated identification and mitigation of disclosive outputs.

Findings

01. Supports Python, R, and Stata with front-end packages

02. Automates identification of disclosive outputs

03. Provides audit-friendly reports and decision tracking

Abstract

This article presents a free and open-source toolkit that supports the semi-automated checking of research outputs (SACRO) for privacy disclosure within secure data environments. SACRO is a framework that applies best-practice, principles-based statistical disclosure control (SDC) techniques on-the-fly as researchers conduct their analyses. SACRO is designed to assist human checkers rather than to replace them, as current automated rules-based approaches seek to do. The toolkit is composed of a lightweight Python package that sits over well-known analysis tools that produce outputs such as tables, plots, and statistical models. This package adds functionality to (i) automatically identify potentially disclosive outputs against a range of commonly used disclosure tests; (ii) apply optional disclosure mitigation strategies as requested; (iii) report reasons for applying SDC; and (iv)…
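To make steps (i) to (iii) of the abstract concrete, here is a minimal sketch of one commonly used disclosure test: a minimum cell-count threshold applied to a frequency table. It is not the toolkit's actual API. The function names (`crosstab`, `check_threshold`, `suppress`), the threshold value of 10, and the sample records are all hypothetical and chosen only for illustration. It uses only the Python standard library.

```python
from collections import Counter

# Hypothetical minimum cell count; in practice the data custodian sets this.
THRESHOLD = 10


def crosstab(records, row_key, col_key):
    """Count the records that fall into each (row, column) cell."""
    return Counter((r[row_key], r[col_key]) for r in records)


def check_threshold(table, threshold=THRESHOLD):
    """Step (i): flag cells with a positive count below the threshold.

    Returns a mapping from each flagged cell to a human-readable reason,
    which doubles as step (iii), reporting why SDC would be applied.
    """
    return {
        cell: f"count {n} < threshold {threshold}"
        for cell, n in table.items()
        if 0 < n < threshold
    }


def suppress(table, flagged):
    """Step (ii): optional mitigation that blanks out the flagged cells."""
    return {cell: (None if cell in flagged else n) for cell, n in table.items()}


# Illustrative (made-up) records: one cell has only 3 observations.
records = (
    [{"region": "north", "status": "employed"}] * 25
    + [{"region": "north", "status": "unemployed"}] * 3
    + [{"region": "south", "status": "employed"}] * 40
    + [{"region": "south", "status": "unemployed"}] * 12
)

table = crosstab(records, "region", "status")
flagged = check_threshold(table)
safe_table = suppress(table, flagged)

# A simple record that a human output checker could review.
report = {"status": "review" if flagged else "pass", "reasons": flagged}
print(report)
print(safe_table)
```

In this sketch the check only flags and explains; whether to suppress, and whether to release the output, remains a decision for the researcher and the human checker, in line with the human-in-the-loop design the abstract describes.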

Code & Models

Repositories

ai-sdc/acro
Official

Taxonomy

Topics: Scientific Computing and Data Management · Data Quality and Management