Title: Supervised learning using Mahalanobis distance for record linkage
Publication Type: Conference Proceedings
Year of Conference: 2011
Authors: Abril D, Navarro-Arribas G, Torra V
Editor: De Baets B, Mesiar R, Troiano L
Conference Location: Univ. of Sannio, Benevento, Italy
Date Published: 11/07/2011
ISBN Number: 978-1-4477-7019-0
Keywords: Choquet integral, Data Privacy, Disclosure risk, fuzzy measure, Mahalanobis distance, record linkage

In data privacy, record linkage is a well-known technique for evaluating the disclosure risk of protected data. The idea is to link records from different databases that refer to the same individuals. In this paper we introduce a new parametrized variation of record linkage that relies on the Mahalanobis distance, together with a supervised learning method that determines the optimum simulated covariance matrix for the linkage process. We evaluate our proposal and compare it with other parametrized and non-parametrized variations of record linkage that have been studied, such as those based on the weighted mean or on the Choquet integral, the latter of which determines the optimal fuzzy measure.
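
As a rough illustration of distance-based record linkage with a Mahalanobis-style distance, the sketch below links each protected record to its nearest original record and reports the fraction of correct links as a simple disclosure-risk estimate. It is not the authors' implementation: the function names, the toy data, and the hand-picked matrix `inv_cov` are all assumptions made for this example, and the supervised learning step that the paper uses to determine the matrix is not reproduced. The sketch also assumes that record i in both files refers to the same individual.

```python
def mahalanobis_sq(a, b, inv_cov):
    """Squared Mahalanobis-style distance (a - b)^T * inv_cov * (a - b)."""
    diff = [x - y for x, y in zip(a, b)]
    # Multiply inv_cov by diff, then take the dot product with diff.
    weighted = [sum(m * d for m, d in zip(row, diff)) for row in inv_cov]
    return sum(d * w for d, w in zip(diff, weighted))


def link_records(original, protected, inv_cov):
    """For each protected record, return the index of the closest original record."""
    links = []
    for p in protected:
        distances = [mahalanobis_sq(o, p, inv_cov) for o in original]
        links.append(distances.index(min(distances)))
    return links


def correct_link_rate(links):
    """Fraction of protected records linked to the original at the same position."""
    hits = sum(1 for i, j in enumerate(links) if i == j)
    return hits / len(links)


# Toy data (made up for illustration): three individuals, two numeric attributes.
original = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
protected = [[1.2, 11.0], [1.9, 19.0], [3.1, 28.0]]

# Hand-picked matrix (assumption); a learned matrix would be supplied here instead.
inv_cov = [[1.0, 0.0], [0.0, 1.0]]

links = link_records(original, protected, inv_cov)
rate = correct_link_rate(links)
print(links, rate)
```

With the identity matrix used here, the distance reduces to the squared Euclidean distance; a matrix with unequal diagonal entries or non-zero off-diagonal entries changes how much each attribute, and each pair of attributes, contributes to the linkage decision.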