Evaluation measure for group-based record linkage

Introduction: The robustness of record linkage evaluation measures is highly important because linkage techniques are assessed against them. However, little research has examined whether existing evaluation measures are suitable for linking groups of records. Linkage quality is generally evaluated using traditional measures such as precision and recall. As we show, these traditional measures are not suitable for evaluating groups of linked records because they assess the quality of individual record pairs rather than the quality of records grouped into clusters.
Objectives: We highlight the shortcomings of traditional evaluation measures and then propose a novel method to evaluate clustering quality in the context of group-based record linkage.
Methods: The proposed linkage evaluation method assesses how well individual records have been allocated to predicted groups (clusters) with respect to ground-truth data. We first identify the best representative predicted cluster for each ground-truth cluster. Based on the resulting mapping, each record in a ground-truth cluster is then assigned to one of seven categories. These categories reflect how well the linkage technique assigned records to groups.
Results: We empirically evaluate our proposed method using real-world data and show that it better reflects the quality of clusters generated by three group-based record linkage techniques. We also show that traditional measures such as precision and recall can produce ambiguous results, whereas our method does not.
Conclusions: The proposed evaluation method provides unambiguous results for the assessed group-based record linkage approaches. The method comprises seven categories that reflect how each record was predicted, providing more detailed information about the quality of the linkage result. This supports better-informed decisions about which linkage technique is best suited to a given linkage application.
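As an illustration of the distinction drawn above, the following minimal Python sketch computes pairwise precision and recall for a clustering and then maps each ground-truth cluster to a representative predicted cluster. The function names, the toy data, and the "largest overlap" criterion for choosing the representative are assumptions made for this example; the abstract does not state the exact criterion, and the seven record categories are not defined here, so they are not sketched.

```python
from itertools import combinations


def record_pairs(clusters):
    """Return the set of unordered record pairs that share a cluster."""
    pairs = set()
    for members in clusters.values():
        pairs.update(frozenset(p) for p in combinations(sorted(members), 2))
    return pairs


def pairwise_precision_recall(truth, predicted):
    """Traditional pair-based precision and recall of a predicted clustering."""
    true_pairs = record_pairs(truth)
    pred_pairs = record_pairs(predicted)
    true_positives = len(true_pairs & pred_pairs)
    precision = true_positives / len(pred_pairs) if pred_pairs else 0.0
    recall = true_positives / len(true_pairs) if true_pairs else 0.0
    return precision, recall


def best_representative(truth, predicted):
    """Map each ground-truth cluster to one predicted cluster.

    Assumption: the representative is the predicted cluster sharing the
    most records with the ground-truth cluster; ties go to the smallest
    predicted cluster id. The source method may use a different rule.
    """
    mapping = {}
    for truth_id, truth_members in truth.items():
        best_id, best_overlap = None, 0
        for pred_id in sorted(predicted):
            overlap = len(truth_members & predicted[pred_id])
            if overlap > best_overlap:
                best_id, best_overlap = pred_id, overlap
        mapping[truth_id] = best_id
    return mapping


# Hypothetical toy data: record "c" is placed in the wrong predicted group.
truth = {"T1": {"a", "b", "c"}, "T2": {"d", "e"}}
predicted = {"P1": {"a", "b"}, "P2": {"c", "d", "e"}}

print(pairwise_precision_recall(truth, predicted))  # (0.5, 0.5)
print(best_representative(truth, predicted))  # {'T1': 'P1', 'T2': 'P2'}
```

In this toy case a single misplaced record halves both pair-based scores, while the cluster mapping shows that four of the five records sit in the representative cluster of their true group. A record-level categorisation is meant to expose this kind of detail.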

Related Results

Młodociani sprawcy przestępstw przeciwko mieniu [Young adult perpetrators of offences against property]
The new Polish penal legislation of 1969 introduced special rules of criminal liability for young adult offenders aged 17-20. In 1972 criminological research was undertaken in orde...
Italian Ornithological Commission (COI) - Report 30
Italian Ornithological Commission (COI) - Report 30. This report refers to records from January 1st 2020 to December 31st 2021, with the addition of a number of records from previo...
Consistently evaluating data linkage classification results
Objectives: Data linkage is commonly viewed as the problem of classifying record pairs into matches and non-matches. In situations where ground truth data are available, performance ...
Measurable Progress? Teaching Artsworkers to Assess and Articulate the Impact of Their Work
The National Cultural Policy Discussion Paper—drafted to assist the Australian Government in developing the first national Cultural Policy since Creative Nation nearly two decades ...
Non-Recommended Publishing Lists: Strategies for Detecting Deceitful Journals
Abstract: The rapid growth of open access publishing (OAP) has significantly improved the accessibility and dissemination of scientific knowledge. However, this expansion has also c...
Commissione Ornitologica Italiana (COI) - Report 28
Italian Ornithological Commission (COI) - Report 28. This report refers to records from 2018, with the addition of a number of records from previous years which were submitted more...
Italian Ornithological Commission (COI) - Report 31
Italian Ornithological Commission (COI) - Report 31. This report refers to records from January 1st 2022 to December 31st 2022, with the addition of a number of records from previo...
