The series hosts a seminar every other week on current research topics. Each seminar features an invited guest speaker and is usually held on Fridays from 11:00am to 12:00pm. Professors Emre Barut, Joseph Gastwirth, and Qing Pan are the Seminar Series Coordinators.

Upcoming Seminar

Date: Friday, April 26th, 11:00am-12:00pm

Location: Duques Hall, Room 152

Title: Linear Regression with Linked Data Files

Speaker: Emanuel Ben-David, Census Bureau

Abstract: Large organizations that own or have access to multiple data sources regularly rely on data integration for conducting large-scale scientific projects. Record linkage, or entity resolution, is an essential task in data integration. The task is to identify which records in different datasets belong to the same entity. In practice, due to the lack of unique identifiers, record linkage is prone to matching errors: false matches and missed matches. Statistical analysis of linked data files, even with low matching error, can then suffer from selection bias and adverse outliers. To adjust the analysis, it is of interest to develop statistical methods that can alleviate the adverse effects of matching errors. In this talk, I consider the regression analysis of “permuted data” in which the record linkage results in an unknown permutation of the observations for the response variable. Assuming that the matching error is small, I propose an approach for estimating the parameters that is statistically sound and computationally feasible.
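
To make the "permuted data" setting concrete, the following is a minimal simulation sketch, not the speaker's estimator. It assumes a single covariate, Gaussian noise, and a hypothetical helper name (simulate_permuted_ols) with illustrative parameter values. It shuffles the responses of a small random subset of records and compares the ordinary least squares slope before and after, which shows how even a low mismatch rate pulls the naive estimate toward zero.

```python
import numpy as np


def simulate_permuted_ols(n=2000, beta=2.0, mismatch_rate=0.1, seed=0):
    """Compare OLS slopes on correctly linked vs. partially permuted responses.

    All names and default values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Correctly linked data: y depends linearly on x plus noise.
    x = rng.normal(size=n)
    y = beta * x + rng.normal(scale=0.5, size=n)

    # Mimic linkage errors: shuffle the responses among a random subset.
    # A random shuffle can leave a few records in place, so the realized
    # mismatch rate is at most mismatch_rate.
    k = int(round(mismatch_rate * n))
    idx = rng.choice(n, size=k, replace=False)
    y_linked = y.copy()
    y_linked[idx] = y[rng.permutation(idx)]

    # Naive OLS with an intercept on both versions of the data.
    X = np.column_stack([np.ones(n), x])
    slope_true = np.linalg.lstsq(X, y, rcond=None)[0][1]
    slope_linked = np.linalg.lstsq(X, y_linked, rcond=None)[0][1]
    return slope_true, slope_linked


if __name__ == "__main__":
    slope_true, slope_linked = simulate_permuted_ols()
    print(f"slope with correct links:  {slope_true:.3f}")
    print(f"slope with permuted links: {slope_linked:.3f}")
```

Under these assumptions the mismatched responses are roughly independent of their covariates, so the naive slope shrinks by about the mismatch rate. This is the kind of bias that an adjusted estimator would aim to correct.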