Centre for Health and Bioinformatics

CHaBi seminar - Paul Burton

DataSHIELD: taking the analysis to the data not the data to the analysis

Date/Time: 18th May 2017 16.00

Venue: Baddiley-Clark Seminar Room

DataSHIELD (www.datashield.ac.uk, www.github.com/datashield) is an innovative software tool enabling secure remote analysis (or joint, parallelized analysis) of individual-person data (microdata) from one or several data-sources simultaneously. Security is underpinned by preventing access to, or visualisation of, the individual-level data themselves and by proactively blocking potentially disclosive analytic output. By avoiding the physical sharing of microdata, DataSHIELD can mitigate governance and intellectual property (IP) concerns that otherwise constrain data-sharing. It also circumvents the risk that, when a data-set is physically shared with a third party, its original custodian(s) will lose control over its ultimate fate and it could end up being copied to a jurisdiction with weak governance.
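
As a rough illustration of the idea of taking the analysis to the data, the sketch below shows several data-sources each returning only a non-disclosive summary (a count and a sum), which an analyst then pools into an overall mean. It is written in Python purely for illustration and is not DataSHIELD's actual R/Opal interface: the names DataSite, summary, pooled_mean and the threshold MIN_CELL_COUNT are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical disclosure threshold: a site refuses to report on fewer records.
MIN_CELL_COUNT = 5


class DataSite:
    """One data custodian. Individual-level values stay inside this object."""

    def __init__(self, name: str, values: Iterable[float]) -> None:
        self.name = name
        self._values: List[float] = list(values)  # never returned to the analyst

    def summary(self) -> Optional[Tuple[int, float]]:
        """Return (count, sum), or None if the output could be disclosive."""
        n = len(self._values)
        if n < MIN_CELL_COUNT:
            return None  # block potentially disclosive output
        return n, float(sum(self._values))


def pooled_mean(sites: Iterable[DataSite]) -> float:
    """Combine per-site summaries into one mean without seeing any microdata."""
    total_n, total_sum = 0, 0.0
    for site in sites:
        result = site.summary()
        if result is None:
            continue  # this site declined to answer
        n, s = result
        total_n += n
        total_sum += s
    if total_n == 0:
        raise ValueError("no site returned a non-disclosive summary")
    return total_sum / total_n


sites = [
    DataSite("site_a", [1, 2, 3, 4, 5, 6]),
    DataSite("site_b", [10, 20, 30, 40, 50]),
    DataSite("site_c", [100, 200]),  # too few records, so its output is blocked
]
mean = pooled_mean(sites)
```

Here the analyst only ever receives aggregate counts and sums, and the third site contributes nothing because its summary falls below the assumed threshold.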

In this lecture, I will explain how DataSHIELD works, outline the practicalities of its implementation, which is based entirely on open-source freeware (R and OPAL), and discuss its (growing) range of applications and potential extensions.

