Enhancing the utility of anonymized data in privacy-preserving data publishing

Files in This Item:
Access to this item has been restricted by the copyright holder until: 2020-12-31
File: AyalaRivera_ucd_5090D_10132.pdf
Size: 9.48 MB
Format: Adobe PDF
Title: Enhancing the utility of anonymized data in privacy-preserving data publishing
Authors: Ayala-Rivera, Vanessa
Advisors: Murphy, Liam; Thorpe, Christina
Permanent link: http://hdl.handle.net/10197/8600
Date: 2017
Abstract: The collection, publication, and mining of personal data have become key drivers of innovation and value creation. In this context, it is vital that organizations comply with the pertinent data protection laws to safeguard the privacy of individuals and prevent the uncontrolled disclosure of their information (especially sensitive data). However, data anonymization is a time-consuming, error-prone, and complex process that requires a high level of expertise in data privacy and domain knowledge; without that expertise, the quality of the anonymized data and the robustness of its privacy protection are compromised. This thesis contributes to the area of Privacy-Preserving Data Publishing by proposing a set of techniques that help users make informed decisions on publishing safe and useful anonymized data, while reducing the expert knowledge and effort required to apply anonymization. In particular, the main contributions of this thesis are: (1) A novel method to evaluate, in an objective, quantifiable, and automatic way, the semantic quality of value generalization hierarchies (VGHs) for categorical data. Improving the specification of the VGHs also improves the quality of the anonymized data. (2) A framework for the automatic construction and multi-dimensional evaluation of VGHs. The aim is to generate VGHs more efficiently, and of better quality, than manual construction allows. Moreover, the evaluation of VGHs is enhanced, as users can compare VGHs from various perspectives and select the ones that best fit their preferences to drive the anonymization of data. (3) A practical approach for the generation of realistic synthetic datasets that preserves the functional dependencies of the data. The aim is to strengthen the testing of anonymization techniques by broadening the number and diversity of the test scenarios. (4) A conceptual framework that describes a set of relevant elements that underlie the assessment and selection of anonymization algorithms, together with a systematic comparison and analysis of a set of anonymization algorithms to identify the factors that influence their performance, in order to guide users in the selection of a suitable algorithm.
Type of material: Doctoral Thesis
Publisher: University College Dublin. School of Computer Science  
Qualification Name: Ph.D.
Copyright (published version): 2017 the author
Keywords: Anonymization; Data Privacy; Data Publishing; Generalization Hierarchies; Knowledge-Based System; Synthetic Data Generation
Other versions: http://dissertations.umi.com/ucd:10132
Language: en
Status of Item: Peer reviewed
Appears in Collections: Computer Science Theses


This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.