Improving the Utility of Anonymized Datasets through Dynamic Evaluation of Generalization Hierarchies

Files in This Item: DynamicEvaluationOfGeneralizationHierarchies.pdf (455.51 kB, Adobe PDF)
Title: Improving the Utility of Anonymized Datasets through Dynamic Evaluation of Generalization Hierarchies
Authors: Ayala-Rivera, Vanessa
Cerqueus, Thomas
Murphy, Liam, B.E.
Thorpe, Christina
Permanent link: http://hdl.handle.net/10197/8767
Date: 30-Jul-2016
Abstract: The dissemination of textual personal information has become a key driver for innovation and value creation. However, due to the possible content of sensitive information, this data must be anonymized, which can reduce its usefulness for secondary uses. One of the most used techniques to anonymize data is generalization. However, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used to dictate the anonymization of data, as poorly-specified VGHs can reduce the usefulness of the resulting data. To tackle this problem, we propose a metric for evaluating the quality of textual VGHs used in anonymization. Our evaluation approach considers the semantic properties of VGHs and exploits information from the input datasets to predict with higher accuracy (compared to existing approaches) the potential effectiveness of VGHs for anonymizing data. As a consequence, the utility of the resulting datasets is improved without sacrificing the privacy goal. We also introduce a novel rating scale to classify the quality of the VGHs into categories to facilitate the interpretation of our quality metric for practitioners.
Funding Details: Science Foundation Ireland
Type of material: Conference Publication
Publisher: IEEE
Keywords: Anonymization; Privacy; Data publishing; Data quality; Generalization hierarchies; Data semantics
DOI: 10.1109/IRI.2016.13
Language: en
Status of Item: Peer reviewed
Conference Details: IEEE 17th International Conference on Information Reuse and Integration (IRI), Pittsburgh, PA, USA, July, 2016
Appears in Collections: Computer Science Research Collection

This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.