Enhancing the utility of anonymized data in privacy-preserving data publishing
Files in This Item:
|AyalaRivera_ucd_5090D_10132.pdf||9.48 MB||Adobe PDF|
|Title:||Enhancing the utility of anonymized data in privacy-preserving data publishing|
|Authors:||Ayala-Rivera, Vanessa|
|Advisor:||Murphy, Liam|
|Permanent link:||http://hdl.handle.net/10197/8600|
|Date:||2017|
|Abstract:||The collection, publication, and mining of personal data have become key drivers of innovation and value creation. In this context, it is vital that organizations comply with the pertinent data protection laws to safeguard the privacy of individuals and prevent the uncontrolled disclosure of their information (especially sensitive data). However, data anonymization is a time-consuming, error-prone, and complex process that requires a high level of expertise in data privacy as well as domain knowledge. Without that expertise, the quality of the anonymized data and the robustness of its privacy protection can be compromised. This thesis contributes to the area of Privacy-Preserving Data Publishing by proposing a set of techniques that help users make informed decisions on publishing safe and useful anonymized data, while reducing the expert knowledge and effort required to apply anonymization. In particular, the main contributions of this thesis are: (1) A novel method to evaluate, in an objective, quantifiable, and automatic way, the semantic quality of value generalization hierarchies (VGHs) for categorical data. Improving the specification of the VGHs also improves the quality of the anonymized data. (2) A framework for the automatic construction and multi-dimensional evaluation of VGHs. The aim is to generate VGHs more efficiently, and of better quality, than manual construction allows. The evaluation of VGHs is also enhanced, as users can compare VGHs from various perspectives and select the ones that best fit their preferences to drive the anonymization of data. (3) A practical approach for the generation of realistic synthetic datasets that preserves the functional dependencies of the data. The aim is to strengthen the testing of anonymization techniques by increasing the number and diversity of the test scenarios. (4) A conceptual framework that describes a set of relevant elements that underlie the assessment and selection of anonymization algorithms, together with a systematic comparison and analysis of a set of anonymization algorithms to identify the factors that influence their performance, in order to guide users in the selection of a suitable algorithm.|
|Type of material:||Doctoral Thesis|
|Publisher:||University College Dublin. School of Computer Science|
|Qualification Name:||Ph.D.|
|Copyright (published version):||2017 the author|
|Keywords:||Anonymization; Data Privacy; Data Publishing; Generalization Hierarchies; Knowledge-Based System; Synthetic Data Generation|
|Other versions:||http://dissertations.umi.com/ucd:10132|
|Language:||en|
|Status of Item:||Peer reviewed|
|Appears in Collections:||Computer Science Theses|
This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this item is made available, or to notes contained in the item itself. Other terms may apply.