On linkage bias-correction for estimators using iterated bootstraps
By: Siu-Ming Tam, Min Wang, Alicia Rambaldi and more
Potential Business Impact:
Corrects the bias that record-linkage errors introduce into estimates computed from combined data.
Amalgamating data from disparate sources yields an integrated dataset that is a valuable resource for statistical analysis. In probabilistic record linkage, the effectiveness of such integration relies on the availability of linkage variables that are free from errors. Where such variables are unavailable, the linked dataset suffers from linkage errors, and analyses based on it suffer from linkage bias. This paper proposes a methodology that uses the bootstrap to construct linkage bias-corrected estimators. It also introduces a test to assess whether increasing the number of bootstrap iterations meaningfully reduces linkage bias or merely inflates variance without further improving accuracy. The methodologies are applied to a simulated dataset featuring hormone information and to a dataset obtained by linking two datasets from the Australian Bureau of Statistics' labour mobility surveys.
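For readers unfamiliar with bootstrap bias correction, the general idea can be sketched as follows. This is a generic, single-level illustration and not the paper's linkage-specific or iterated procedure: it does not model linkage errors at all. The function name `bootstrap_bias_corrected`, the toy Gaussian sample, and the choice of the plug-in variance as the biased estimator are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_bias_corrected(data, estimator, n_boot=1000, seed=0):
    """Return (corrected_estimate, estimated_bias) from one bootstrap level.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = estimator(data)
    # Resample with replacement and re-apply the estimator to each resample.
    replicates = [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]
    # Bootstrap estimate of bias: mean of the replicates minus the original estimate.
    bias_hat = statistics.fmean(replicates) - theta_hat
    # Subtract the estimated bias from the original estimate.
    return theta_hat - bias_hat, bias_hat


# Toy example (assumed data): the plug-in variance divides by n,
# so it underestimates the true variance on average.
data_rng = random.Random(42)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(30)]

theta_hat = statistics.pvariance(sample)
corrected, bias_hat = bootstrap_bias_corrected(
    sample, statistics.pvariance, n_boot=2000, seed=1
)
print(f"plug-in: {theta_hat:.4f}  bias estimate: {bias_hat:.4f}  corrected: {corrected:.4f}")
```

An iterated bootstrap would repeat the same correction on the corrected estimator itself. Each extra level can remove more bias but adds Monte Carlo variability, which is the trade-off the paper's proposed test is meant to assess.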
Similar Papers
Regression Analysis After Bipartite Bayesian Record Linkage
Methodology
Better links improve study results.
Relaxing the Assumption of Strongly Non-Informative Linkage Error in Secondary Regression Analysis of Linked Files
Methodology
Fixes mistakes when combining different data.
Bootstrap Nonparametric Inference under Data Integration
Methodology
Helps computers learn from different data sources.