Federated Data Linkage in Practice
In recent years, great strides have been made towards the deployment of federated systems for data research, including the exploration of federated trusted research environments (TREs). These federated TREs allow data analysts to securely use data held across multiple institutions with minimal overheads while preserving information governance. However, because data linkage pipelines rely on sensitive data, especially in privacy-preserving applications, the real-world deployment of federated data linkage is still in its early stages.
There are several challenges to overcome with privacy-preserving federated linkage, including securely and efficiently sharing the information needed to identify similar records. Another key consideration is the resilience of the federated linkage ecosystem to information being added or removed as data providers join and leave the initiative, since changes in the record set can create or destroy intermediary links. In non-federated linkage, a spine dataset such as a national census can provide an anchor to which incoming datasets can be linked. In federated linkage this may not be the case, especially if providers are globally dispersed or no sufficiently broad spine exists for the use case, and so some datasets may be critical to the overall system.
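The effect of a provider leaving can be illustrated with a minimal sketch. It is not the authors' method: it assumes records are identified by hypothetical `(provider, record_id)` tuples, that pairwise links have already been agreed between providers, and that entities are formed by transitive closure of those links (a simple union-find). The names `link_clusters`, `records` and `links` are illustrative only.

```python
from collections import defaultdict


def link_clusters(records, pairwise_links, active_providers):
    """Group records into entities using only links between active providers.

    records: list of (provider, record_id) tuples (hypothetical identifiers).
    pairwise_links: list of (record, record) pairs judged to be the same entity.
    active_providers: set of provider names currently in the federation.
    """
    active = [r for r in records if r[0] in active_providers]
    parent = {r: r for r in active}

    def find(x):
        # Follow parent pointers to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairwise_links:
        # A link only survives if both of its endpoints are still available.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for r in active:
        clusters[find(r)].append(r)
    return sorted(sorted(c) for c in clusters.values())


records = [("A", "a1"), ("B", "b7"), ("C", "c3")]
# A and C are never linked directly; they are only connected through B.
links = [(("A", "a1"), ("B", "b7")), (("B", "b7"), ("C", "c3"))]

with_b = link_clusters(records, links, {"A", "B", "C"})
without_b = link_clusters(records, links, {"A", "C"})
```

With all three providers present, the records form a single entity. Once provider B withdraws, the A and C records fall into separate clusters, because the only evidence connecting them was the intermediary record held by B. In this toy setting, B plays the role of a critical dataset.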
We present a discussion of the real-world challenges of federated linkage in population-scale applications, including approaches that allow privacy-preserving linkage across multiple TREs and the handling of information governance requirements. We also explore the impact of the collaborative nature of federated linkage, looking at how linkage quality and record links are affected in practice.
Related Results
Description of the international consortium for prostate cancer genetics, and failure to replicate linkage of hereditary prostate cancer to 20q13
Abstract: The International Consortium for Prostate Cancer Genetics (ICPCG) is an international collaborative effort to pool pedigrees with hereditary prostate cancer (PC) in order t...
On a Framework for Federated Cluster Analysis
Federated learning is becoming increasingly popular to enable automated learning in distributed networks of autonomous partners without sharing raw data. Many works focus on superv...
One box to search them all
Purpose: The purpose of this paper is to present how, in May 2008, the Ad Hoc Committee on Federated Search was formed to prepare a preliminary report on federated searching for a sp...
Image-based crop disease detection with federated learning
Abstract: Crop disease detection and management is critical to improving productivity, reducing costs, and promoting environmentally friendly crop treatment methods. Modern ...
Cloud-Based Federated Learning Implementation Across Medical Centers
Purpose: Building well-performing machine learning (ML) models in health care has always been exigent because of the data-sharing concerns, yet ML approaches often require larger tra...
Evaluation measure for group-based record linkage
Introduction: The robustness of record linkage evaluation measures is of high importance since linkage techniques are assessed based on these. However, minimal research has been con...
Consistently evaluating data linkage classification results
Objectives: Data linkage is commonly viewed as the problem of classifying record pairs into matches and non-matches. In situations where ground truth data are available, performance ...
Perspectives on linkage to care for patients diagnosed with HIV: A qualitative study at a rural health center in South Western Uganda
Linkage to care for newly diagnosed human immunodeficiency virus (HIV) patients is important to ensure that patients have good access to care. However, there is little information ...

