Title: Spherical microaggregation: Anonymizing sparse vector spaces
Publication Type: Journal Article
Year of Publication: 2015
Authors: Abril D, Navarro-Arribas G, Torra V
Journal: Computers & Security
Keywords: anonymization, data mining, sparse data, vector space

Abstract: Unstructured texts are a very popular data type, yet they remain largely unexplored in the privacy-preserving data mining field. We consider the problem of providing public information about a set of confidential documents. To that end, we have developed a method to protect a Vector Space Model (VSM) so that it can be made public even if the documents it represents are private. This method is inspired by microaggregation, a popular protection method from statistical disclosure control, and is adapted to work with sparse and high-dimensional data sets.
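
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes that "spherical" means documents are compared by cosine similarity on unit-length vectors and that each group is represented by its renormalized centroid, as in spherical k-means. The greedy grouping step and all function names (`normalize`, `cosine`, `centroid`, `spherical_microaggregate`) are hypothetical choices made for illustration.

```python
import math

def normalize(vec):
    """Scale a sparse vector (dict: term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def centroid(vectors):
    """Sum the vectors and renormalize so the result stays on the unit sphere."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return normalize(total)

def spherical_microaggregate(docs, k):
    """Replace every document vector with the centroid of a group of >= k documents.

    Assumption: a simple greedy partition stands in for whatever partitioning
    heuristic the paper uses. Requires len(docs) >= k for the size guarantee.
    """
    unit = [normalize(d) for d in docs]
    remaining = list(range(len(unit)))
    protected = [None] * len(unit)
    while remaining:
        if len(remaining) < 2 * k:
            # Too few documents left to form two groups: put them all in one.
            group = list(remaining)
        else:
            # Take the first remaining document and its k-1 most similar peers.
            seed = remaining[0]
            ranked = sorted(remaining,
                            key=lambda i: cosine(unit[seed], unit[i]),
                            reverse=True)
            group = ranked[:k]
        representative = centroid([unit[i] for i in group])
        for i in group:
            protected[i] = representative
        remaining = [i for i in remaining if i not in group]
    return protected

# Toy term-weight vectors for four documents (made-up data).
docs = [
    {"a": 1.0, "b": 1.0},
    {"a": 2.0, "b": 1.0},
    {"x": 1.0, "y": 3.0},
    {"x": 1.0, "y": 2.0},
]
protected = spherical_microaggregate(docs, k=2)
```

In this sketch every published vector is shared by at least `k` documents, which is the property microaggregation relies on, and storing vectors as dictionaries keeps the cost proportional to the number of nonzero terms rather than to the vocabulary size.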