Bluemetrix’s Data Masking and Tokenization enables rapid UK Covid-19 research
Since the start of the pandemic, we have often heard officials say that ‘we’re following the science’. From politicians explaining lockdown measures to scientists discussing the logic behind vaccine rollouts, all eyes have been on the ‘science’.
Yet what’s at the very heart of the science? When it comes down to it, it’s data. The UK’s largest healthcare provider, for example, has gathered millions of data points about tens of thousands of patients who have suffered in a variety of ways from Covid-19.
Now, because we’re living in extraordinary times, you might think that the healthcare provider would forgo data protection policies to speed up their processes. However, nothing could be further from the truth.
Teaming up with Bluemetrix
Last year, the healthcare provider teamed up with Bluemetrix to support their cloud-based informatics solution which, in turn, would allow their staff to access de-identified data. Once the data had been migrated to the cloud, the healthcare provider could control access to the data and its security, while keeping a full audit of every action that healthcare staff took while interacting with the data.
Once the data was in the cloud, the healthcare provider could not only analyse it but also, for the first time, share it with third parties. This was a vital and much-needed step, as the research into Covid-19 involved many stakeholders.
Furthermore, the audited data was accessible to both the healthcare provider and the Data Protection Officer.
Early on in the pandemic, the healthcare provider realised that by moving their data to the cloud, they could create a solution that would allow for the fast processing of data. However, moving such large volumes of critical data – critical both in terms of what the data had captured, and its end purpose i.e. supporting patient care and Covid research – required a robust solution.
By implementing Bluemetrix’s BDM platform, the healthcare provider used data masking and tokenization to secure the data before it was migrated to the cloud, where it could then be shared with researchers and staff within the healthcare provider and with third parties. Also, whenever any stakeholder interacted with the data, whether inside the healthcare provider or at a third party, the integrity of the data remained constant and reliable.
Data masking and tokenization
Data masking and tokenization allowed the original data to be masked in such a way that it stayed in line with data protection policies yet remained useful to researchers.
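To make the distinction concrete, here is a minimal Python sketch of the two techniques. It is an illustration only and not Bluemetrix's BDM implementation: the field names, the `TokenVault` class and the masking rules are all assumptions chosen for the example. Masking irreversibly reduces the detail in a value (a full date of birth becomes a birth year), while tokenization swaps an identifier for a random token that only the holder of the vault can map back.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault; a real one would be a secured store."""

    def __init__(self):
        self._forward = {}  # original value -> token
        self._reverse = {}  # token -> original value

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same patient always maps to the same token.
        if value not in self._forward:
            token = "tok_" + secrets.token_hex(8)
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def detokenize(self, token: str) -> str:
        # Only a party with access to the vault can recover the original value.
        return self._reverse[token]


def mask_date_of_birth(dob: str) -> str:
    """Mask an ISO date (YYYY-MM-DD) down to the year only."""
    return dob[:4]


def mask_postcode(postcode: str) -> str:
    """Keep the outward part of a UK postcode and hide the rest."""
    outward = postcode.strip().split(" ")[0]
    return outward + " ***"


def deidentify(record: dict, vault: TokenVault) -> dict:
    """Return a copy of the record that is safe to share with researchers."""
    return {
        "patient_token": vault.tokenize(record["patient_id"]),
        "birth_year": mask_date_of_birth(record["date_of_birth"]),
        "postcode_area": mask_postcode(record["postcode"]),
        "test_result": record["test_result"],  # clinical value kept for research
    }


vault = TokenVault()
raw = {
    "patient_id": "P001",  # made-up identifier
    "date_of_birth": "1956-03-14",
    "postcode": "SW1A 1AA",
    "test_result": "positive",
}
safe = deidentify(raw, vault)
print(safe)
```

The researcher sees a record that still carries the clinical result and enough context to be analysed, while the identifier that points back to the patient stays inside the vault.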
Technically, the data could be safely and securely accessed using R and Python to connect to the Hadoop cluster, where the relevant Covid-19 data sat.
In conjunction with the healthcare provider and other key stakeholders, Bluemetrix created a data analytics platform that brought together data from multiple providers and data owners, thus creating a complete picture of all Covid-19 related information.
This dataset included primary care, acute, mental health, community and social care data to provide a linked dataset that was depersonalised.
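One common way to link records from several sources without exposing the identifier is to derive the same pseudonym from it in every source, for example with a keyed hash. The Python sketch below illustrates that idea under stated assumptions: the key, the source names and the record fields are hypothetical, and this is not a description of how the Bluemetrix platform performs the linkage.

```python
import hashlib
import hmac
from collections import defaultdict

# Assumption: a secret key held only by the data controller.
SECRET_KEY = b"example-key-held-by-the-data-controller"


def pseudonymise(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def link_records(sources: dict) -> dict:
    """Group rows from every source under the patient's pseudonym."""
    linked = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            token = pseudonymise(row["patient_id"])
            # Drop the direct identifier before the row joins the linked dataset.
            fields = {k: v for k, v in row.items() if k != "patient_id"}
            linked[token].setdefault(source_name, []).append(fields)
    return dict(linked)


# Made-up rows standing in for two of the care settings.
sources = {
    "primary_care": [{"patient_id": "P001", "diagnosis": "asthma"}],
    "acute": [
        {"patient_id": "P001", "admitted": "2020-04-02"},
        {"patient_id": "P002", "admitted": "2020-05-11"},
    ],
}
linked = link_records(sources)
for token, by_source in linked.items():
    print(token, sorted(by_source))
```

Because the same key produces the same pseudonym in each source, a patient's primary care and acute records end up side by side, yet the linked dataset itself never contains the original identifier.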
As the data had many owners and was from many