COCOA: A Synthetic Data Generator for Testing Anonymization Techniques

Files in This Item:
File: COCOA_PSD2016.pdf
Size: 1.22 MB
Format: Adobe PDF
Title: COCOA: A Synthetic Data Generator for Testing Anonymization Techniques
Authors: Ayala-Rivera, Vanessa
Portillo Dominguez, Andres Omar
Murphy, Liam, B.E.
Thorpe, Christina
Permanent link: http://hdl.handle.net/10197/8763
Date: 16-Sep-2016
Abstract: Conducting extensive testing of anonymization techniques is critical to assess their robustness and identify the scenarios where they are most suitable. However, access to real microdata is highly restricted, and the microdata that is publicly available is usually anonymized or aggregated, which reduces its value for testing purposes. In this paper, we present a framework (COCOA) for generating realistic synthetic microdata that allows users to define multi-attribute relationships in order to preserve the functional dependencies of the data. We demonstrate how COCOA helps strengthen the testing of anonymization techniques by broadening the number and diversity of test scenarios. Results also show that COCOA is practical for generating large datasets.
Funding Details: Science Foundation Ireland
Type of material: Conference Publication
Publisher: Springer
Copyright (published version): 2017 Springer
Keywords: Synthetic data; Anonymization; Testing; Data privacy
DOI: 10.1007/978-3-319-45381-1_13
Language: en
Status of Item: Peer reviewed
Is part of: Domingo-Ferrer, J., Pejić-Bach, M. (eds.). Lecture Notes in Computer Science, volume 9867
Conference Details: UNESCO Chair in Data Privacy, International Conference, PSD 2016, Dubrovnik, Croatia, September 14–16, 2016
Appears in Collections: Computer Science Research Collection
PEL Research Collection


This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this item is made available, or to notes contained in the item itself. Other terms may apply.