Title: Data mining methods for linking data coming from several sources
Publication Type: Conference Paper
Year of Publication: 2004
Authors: Torra V, Domingo-Ferrer J, Torres À
Conference Name: Monographs in official statistics. 3rd Joint UN/ECE-Eurostat Work session on statistical data confidentiality

Statistical offices face the problem of multiple-database data mining for at least two reasons. On the one hand, there is a trend to avoid collecting data directly from respondents and to build statistical data from administrative data sources instead; such administrative sources are typically diverse and scattered across several administrative levels. On the other hand, intruders may attempt disclosure of confidential statistical data by using the same approach, i.e. by linking whatever databases they can obtain. This paper discusses issues related to multiple-database data mining, with a special focus on a method for linking records across databases that do not share any variables.