  • Publication
    A Systematic Comparison and Evaluation of k-Anonymization Algorithms for Practitioners
    The vast amount of data being collected about individuals has brought new challenges in protecting their privacy when this data is disseminated. As a result, Privacy-Preserving Data Publishing has become an active research area, in which multiple anonymization algorithms have been proposed. However, given the large number of algorithms available and the limited information regarding their performance, it is difficult to identify and select the most appropriate algorithm for a particular publishing scenario, especially for practitioners. In this paper, we perform a systematic comparison of three well-known k-anonymization algorithms to measure their efficiency (in terms of resource usage) and their effectiveness (in terms of data utility). We extend the scope of their original evaluation by employing a more comprehensive set of scenarios: different parameters, metrics and datasets. Using publicly available implementations of those algorithms, we conduct a series of experiments and a comprehensive analysis to identify the factors that influence their performance, in order to guide practitioners in the selection of an algorithm. We demonstrate, through experimental evaluation, the conditions in which one algorithm outperforms the others for a particular metric, depending on the input dataset and privacy requirements. Our findings motivate the need for methodologies that recommend the best algorithm for a particular publishing scenario.
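For readers unfamiliar with the property that k-anonymization algorithms enforce, the following is a minimal sketch of a k-anonymity check. It is not taken from the paper or from any of the three algorithms it compares; the function name `is_k_anonymous`, the column names, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records (hypothetical helper, not from the paper)."""
    # Count how many records share each quasi-identifier combination.
    groups = Counter(
        tuple(record[qi] for qi in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Illustrative, already-generalized records (made-up data).
records = [
    {"age": "30-39", "zip": "123**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "123**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "456**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "456**", "diagnosis": "diabetes"},
]

two_anonymous = is_k_anonymous(records, ["age", "zip"], k=2)
three_anonymous = is_k_anonymous(records, ["age", "zip"], k=3)
```

In this sketch each quasi-identifier combination covers two records, so the table satisfies the check for k = 2 but not for k = 3. An actual anonymization algorithm would also have to decide how to generalize or suppress values to reach that state, which is where the efficiency and utility trade-offs studied in the paper arise.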
  • Publication
    Enhancing the Utility of Anonymized Data by Improving the Quality of Generalization Hierarchies
    The dissemination of textual personal information has become an important driver of innovation. However, because it may contain sensitive information, this data must be anonymized. A commonly used technique to anonymize data is generalization. Nevertheless, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used, as poorly specified VGHs can decrease the usefulness of the resulting data. To tackle this problem, in our previous work we presented the Generalization Semantic Loss (GSL), a metric that captures the quality of categorical VGHs in terms of semantic consistency and taxonomic organization. We validated the accuracy of GSL using an intrinsic evaluation with respect to a gold standard ontology. In this paper, we extend our previous work by conducting an extrinsic evaluation of GSL with respect to the performance that VGHs have in anonymization (using data utility metrics). We show how GSL can be used to perform an a priori assessment of the VGHs' effectiveness for anonymization. In this manner, data publishers can quantitatively compare the quality of various VGHs and identify (before anonymization) those that better retain the semantics of the original data. Consequently, the utility of the anonymized datasets can be improved without sacrificing the privacy goal. Our results demonstrate the accuracy of GSL, as the quality of VGHs measured with GSL strongly correlates with the utility of the anonymized data. Results also show the benefits that an a priori VGH assessment strategy brings to the anonymization process in terms of time savings and a reduction in the dependency on expert knowledge. Finally, GSL also proved to be lightweight in terms of computational resources.
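To illustrate what generalization over a categorical VGH looks like, here is a minimal sketch under stated assumptions. The hierarchy contents, the `VGH` dictionary, and the `generalize` function are hypothetical and are not from the paper; the sketch also does not compute the GSL metric, it only shows how a value is replaced by its ancestors.

```python
# Hypothetical categorical VGH, stored as a child -> parent mapping.
# "*" stands for the fully suppressed root value.
VGH = {
    "flu": "respiratory disease",
    "asthma": "respiratory disease",
    "diabetes": "metabolic disease",
    "respiratory disease": "disease",
    "metabolic disease": "disease",
    "disease": "*",
}


def generalize(value, levels, hierarchy=VGH):
    """Replace a value with its ancestor `levels` steps up the hierarchy.

    Values without a parent (the root, or anything unknown to the
    hierarchy) are returned unchanged.
    """
    for _ in range(levels):
        value = hierarchy.get(value, value)
    return value


one_level = generalize("flu", 1)    # direct parent
two_levels = generalize("flu", 2)   # grandparent
past_root = generalize("flu", 5)    # climbing stops at the root
```

A poorly specified hierarchy, for example one that placed "flu" directly under "disease", would force the same record to lose more detail in a single step, which is the kind of quality difference an a priori VGH assessment is meant to expose.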
  • Publication
    Integration of QoS Metrics, Rules and Semantic Uplift for Advanced IPTV Monitoring
    Increasing and variable traffic demands due to triple play services pose significant Internet Protocol Television (IPTV) resource management challenges for service providers. Managing subscriber expectations via consolidated IPTV quality reporting will play a crucial role in guaranteeing return on investment for players in the increasingly competitive IPTV delivery ecosystem. We propose a fault diagnosis and problem isolation solution that addresses the IPTV monitoring challenge and recommends problem-specific remedial action. IPTV delivery-specific metrics are collected at various points in the delivery topology, from the residential gateway and the Digital Subscriber Line Access Multiplexer through to the video Head-End. They are then pre-processed using new metric rules. A semantic uplift engine takes these raw metric logs, transforms them into the World Wide Web Consortium's standard Resource Description Framework for knowledge representation, and annotates them with expert knowledge from the IPTV domain. This system is then integrated with a monitoring visualization framework that displays monitoring events and alarms and recommends solutions. A suite of IPTV fault scenarios is presented and used to evaluate the feasibility of the solution. We demonstrate that professional service providers can provide timely reports on the quality of IPTV service delivery using this system.
    Scopus citations: 8