University College Dublin
Enhancing the Utility of Anonymized Data by Improving the Quality of Generalization Hierarchies

File(s)
EnhancingUtilityTDP2017.pdf (1.17 MB)
Author(s)
Ayala-Rivera, Vanessa 
McDonagh, Patrick 
Cerqueus, Thomas 
Murphy, Liam, B.E. 
Thorpe, Christina 
URI
http://hdl.handle.net/10197/9317
Date Issued
April 2017
Date Available
2018-04-11T15:38:29Z
Abstract
The dissemination of textual personal information has become an important driver of innovation. However, because this data may contain sensitive information, it must be anonymized. A commonly used technique to anonymize data is generalization. Nevertheless, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used, as poorly specified VGHs can decrease the usefulness of the resulting data. To tackle this problem, in our previous work we presented the Generalization Semantic Loss (GSL), a metric that captures the quality of categorical VGHs in terms of semantic consistency and taxonomic organization. We validated the accuracy of GSL using an intrinsic evaluation with respect to a gold standard ontology. In this paper, we extend our previous work by conducting an extrinsic evaluation of GSL with respect to the performance that VGHs have in anonymization (using data utility metrics). We show how GSL can be used to perform an a priori assessment of the VGHs' effectiveness for anonymization. In this manner, data publishers can quantitatively compare the quality of various VGHs and identify (before anonymization) those that better retain the semantics of the original data. Consequently, the utility of the anonymized datasets can be improved without sacrificing the privacy goal. Our results demonstrate the accuracy of GSL, as the quality of VGHs measured with GSL strongly correlates with the utility of the anonymized data. Results also show the benefits that an a priori VGH assessment strategy brings to the anonymization process in terms of time savings and a reduction in the dependency on expert knowledge. Finally, GSL also proved to be lightweight in terms of computational resources.
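For readers unfamiliar with generalization, the following is a minimal sketch of how a categorical value can be replaced by a broader term taken from a VGH. The occupation hierarchy, the `PARENT` mapping, and the `generalize` helper are hypothetical illustrations and are not taken from the paper; the sketch does not implement GSL or any of the paper's utility metrics.

```python
# Hypothetical toy VGH for an "occupation" attribute (not from the paper).
# Each value maps to its parent; "Any" is the root and has no parent.
PARENT = {
    "Nurse": "Healthcare",
    "Surgeon": "Healthcare",
    "Teacher": "Education",
    "Lecturer": "Education",
    "Healthcare": "Any",
    "Education": "Any",
}


def generalize(value, levels):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in PARENT:  # the root has no parent
            break
        value = PARENT[value]
    return value


records = ["Nurse", "Surgeon", "Teacher", "Lecturer"]

# One level up keeps some meaning (the sector); two levels up keeps none.
level1 = [generalize(v, 1) for v in records]
level2 = [generalize(v, 2) for v in records]

print(level1)
print(level2)
```

In this toy hierarchy, a single generalization step still distinguishes the two sectors, while a second step collapses every record to the root. A hierarchy whose intermediate terms are poorly chosen would lose more meaning at each step, which is the kind of quality difference the abstract says GSL is intended to quantify before anonymization.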
Sponsorship
European Commission - European Regional Development Fund
Science Foundation Ireland
Type of Material
Journal Article
Publisher
Transactions on Data Privacy
Journal
Transactions on Data Privacy
Volume
10
Issue
1
Start Page
27
End Page
59
Keywords
  • Privacy
  • Data Publishing
  • Data Quality
  • Generalization Hierar...
  • Data Semantics

Web versions
http://www.tdp.cat/issues16/tdp.a261a16.pdf
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
Owning collection
Computer Science Research Collection
Views
1124
Acquisition Date
Mar 28, 2023
Downloads
279
Last Week
3
Last Month
8
University College Dublin Research Repository UCD
The Library, University College Dublin, Belfield, Dublin 4
Phone: +353 (0)1 716 7583
Fax: +353 (0)1 283 7667
Email: research.repository@ucd.ie
Guide: http://libguides.ucd.ie/rru
