Safe machine learning model release from Trusted Research Environments: The SACRO-ML package

Jim C Smith, Richard J. Preen, Andrew McCarthy, Maha Albashir, Alba Crespi-Boixader, Shahzad Mumtaz, James Liley, Simon Rogers, Yola Jones

Research output: Working paper › Preprint

Abstract

We present SACRO-ML, an integrated suite of open-source Python tools to facilitate the statistical disclosure control (SDC) of machine learning (ML) models trained on confidential data prior to public release. SACRO-ML combines (i) a SafeModel package that extends commonly used ML models to provide ante-hoc SDC by assessing the disclosure vulnerability posed by the training regime; and (ii) an Attacks package that provides post-hoc SDC by rigorously assessing the empirical disclosure risk of a model through a variety of simulated attacks after training. The SACRO-ML code and documentation are available under an MIT license at https://github.com/AI-SDC/SACRO-ML
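To make the post-hoc idea concrete, the following sketch shows the kind of simulated attack the Attacks package runs: a membership-inference attack that checks whether a trained model leaks which records were in its training set. This is an illustration using plain scikit-learn only; it is NOT the SACRO-ML API, and all variable and function names here are hypothetical.

```python
# Illustrative membership-inference attack sketch (not the SACRO-ML API).
# Idea: an overfit model is more confident on training ("member") records
# than on unseen ("non-member") records, and that gap is a disclosure risk.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Target model, deliberately prone to overfitting so leakage is visible.
target = RandomForestClassifier(
    n_estimators=50, min_samples_leaf=1, random_state=0
).fit(X_train, y_train)

def true_label_confidence(model, X, y):
    """Attack signal: the model's predicted probability for the true label."""
    probs = model.predict_proba(X)
    return probs[np.arange(len(y)), y]

member_scores = true_label_confidence(target, X_train, y_train)
nonmember_scores = true_label_confidence(target, X_test, y_test)

scores = np.concatenate([member_scores, nonmember_scores])
labels = np.concatenate(
    [np.ones(len(member_scores)), np.zeros(len(nonmember_scores))]
)

# AUC near 0.5 means the attacker cannot distinguish members from
# non-members (low empirical disclosure risk); values near 1.0 indicate
# serious membership leakage and would argue against releasing the model.
attack_auc = roc_auc_score(labels, scores)
print(f"membership-inference attack AUC: {attack_auc:.3f}")
```

A release-checking tool would compare such attack metrics against agreed thresholds before a model is allowed out of the trusted research environment.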
Original language: English
Publisher: arXiv
DOIs
Publication status: Published - 2 Dec 2022

Funding

This work was funded by UK Research and Innovation under Grant Numbers MC_PC_21033 and MC_PC_23006 as part of Phase 1 of the Data and Analytics Research Environments UK (DARE UK) programme, delivered in partnership with Health Data Research UK (HDR UK) and Administrative Data Research UK (ADR UK). The specific projects were Semi-Automatic checking of Research Outputs (SACRO; MC_PC_23006) and Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMATTER; MC_PC_21033). It has also been supported by MRC and EPSRC (PICTURES; MR/S010351/1).

Funders: UK Research and Innovation
Funder numbers: MC_PC_21033, MC_PC_23006, MR/S010351/1

Keywords

• cs.LG
• cs.CR
• cs.IR
