  • Publication
    dSUMO: Towards a Distributed SUMO
    Microscopic urban mobility simulations consist of modelling a city's road network and infrastructure and running autonomous individual vehicles on it, in order to understand accurately what happens in the city. However, when the scale of the problem space is large or when processing time is critical, performing such simulations can be problematic because they are very computationally expensive. In this paper, we propose to leverage the power of many computing resources to run faster or larger microscopic simulations, keeping the same accuracy as the classical simulation running on a single computing unit. We have implemented a distributed version of SUMO, called dSUMO. We show in this paper that the accuracy of the simulation in SUMO is not impacted by the distribution, and we give some preliminary results regarding the performance of dSUMO compared to SUMO.
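The core idea behind such a distribution can be sketched as partitioning the road network among workers; the edges that cross partitions are where per-step synchronization is needed. The graph and partition below are a hypothetical toy example, not dSUMO's implementation:

```python
# Partition road-network nodes among workers and identify the boundary
# edges whose vehicles must be exchanged between partitions each step.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("B", "D")]
partition = {"A": 0, "B": 0, "C": 1, "D": 1}   # node -> worker id

boundary = [(u, v) for (u, v) in edges if partition[u] != partition[v]]
print(boundary)   # edges crossing partitions require message exchange
```

Keeping the number of boundary edges small is what makes the distributed run cheap to synchronize.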
  • Publication
    A Systematic Comparison and Evaluation of k-Anonymization Algorithms for Practitioners
    The vast amount of data being collected about individuals has brought new challenges in protecting their privacy when this data is disseminated. As a result, Privacy-Preserving Data Publishing has become an active research area, in which multiple anonymization algorithms have been proposed. However, given the large number of algorithms available and limited information regarding their performance, it is difficult to identify and select the most appropriate algorithm given a particular publishing scenario, especially for practitioners. In this paper, we perform a systematic comparison of three well-known k-anonymization algorithms to measure their efficiency (in terms of resource usage) and their effectiveness (in terms of data utility). We extend the scope of their original evaluation by employing a more comprehensive set of scenarios: different parameters, metrics and datasets. Using publicly available implementations of those algorithms, we conduct a series of experiments and a comprehensive analysis to identify the factors that influence their performance, in order to guide practitioners in the selection of an algorithm. We demonstrate, through experimental evaluation, the conditions in which one algorithm outperforms the others for a particular metric, depending on the input dataset and privacy requirements. Our findings motivate the necessity of creating methodologies that provide recommendations about the best algorithm given a particular publishing scenario.
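The property that all k-anonymization algorithms enforce, namely that every combination of quasi-identifier values appears in at least k records, can be illustrated with a minimal check. The records and attributes below are hypothetical; this is not one of the three evaluated algorithms:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in >= k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

records = [
    {"age": "30-40", "zip": "083**", "disease": "flu"},
    {"age": "30-40", "zip": "083**", "disease": "cold"},
    {"age": "20-30", "zip": "148**", "disease": "flu"},
    {"age": "20-30", "zip": "148**", "disease": "asthma"},
]
print(is_k_anonymous(records, ["age", "zip"], 2))  # True: each group has 2 records
```

An algorithm's job is to generalize or suppress values until a check like this passes, while sacrificing as little data utility as possible.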
  • Publication
    Synthetic Data Generation using Benerator Tool
    (University College Dublin. School of Computer Science and Informatics, 2013-10-29)
    Datasets of different characteristics are needed by the research community for experimental purposes. However, real data may be difficult to obtain due to privacy concerns. Moreover, real data may not meet the specific characteristics needed to verify new approaches under certain conditions. Given these limitations, the use of synthetic data is a viable alternative to complement real data. In this report, we describe the process followed to generate synthetic data using Benerator, a publicly available tool. The results show that the synthetic data preserves a high level of accuracy compared to the original data. The generated datasets correspond to microdata containing records with social, economic and demographic attributes that mimic the distribution of aggregated statistics from the 2011 Irish Census data.
  • Publication
    Towards the Automatic Detection of Efficient Computing Assets in a Heterogeneous Cloud Environment
    (Institute of Electrical and Electronics Engineers (IEEE), 2013-06-03)
    In a heterogeneous cloud environment, the manual grading of computing assets is the first step in the process of configuring IT infrastructures to ensure optimal utilization of resources. Grading the efficiency of computing assets is, however, a difficult, subjective and time-consuming manual task. Thus, an automatic efficiency grading algorithm is highly desirable. In this paper, we compare the effectiveness of the different criteria used in the manual grading task for automatically determining the efficiency grading of a computing asset. We report results on a dataset of 1,200 assets from two different data centers in IBM Toronto. Our preliminary results show that electrical costs (associated with power and cooling) appear to be even more informative than hardware and age based criteria as a means of determining the efficiency grade of an asset. Our analysis also indicates that the effectiveness of the various efficiency criteria is dependent on the asset demographic of the data center under consideration.
  • Publication
    Network Planning for IEEE 802.16j Relay Networks
    (Auerbach Publications, 2009-04)
    In this chapter, a problem formulation for determining the optimal node locations for base stations (BSs) and relay stations (RSs) in relay-based 802.16 networks is developed. A number of techniques are proposed to solve the resulting integer programming (IP) problem; these are compared in terms of the time taken to find a solution and the quality of the solution obtained. Finally, there is some analysis of the impact of the ratio of BS/RS costs on the solutions obtained. Three techniques are studied to solve the IP problem: (1) a standard branch-and-bound mechanism, (2) an approach in which state-space reduction techniques are applied in advance of the branch-and-bound algorithm, and (3) a clustering approach in which the problem is divided into a number of subproblems that are solved separately, followed by a final overall optimization step. The results show that the basic approach can solve problems for small metropolitan areas; the state-space reduction technique reduces the time taken to find a solution by about 50 percent; and the clustering approach can find solutions of approximately equivalent quality in about 30 percent of the time required by the first approach. After scalability tests were performed, some rudimentary experiments were carried out in which the ratio of BS/RS cost was varied. The initial results show that, for the scenarios studied, reducing the RS cost results in more RSs in the solution, while also decreasing the power required to communicate from the mobile device to its closest infrastructure node (BS or RS).
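On a toy instance, the placement formulation reduces to a minimum-cost coverage problem, which exhaustive search (standing in here for branch and bound) can solve exactly. All site names, costs and coverage sets below are hypothetical, not taken from the chapter:

```python
from itertools import chain, combinations

# Candidate sites: (deployment cost, set of demand points covered).
sites = {
    "BS1": (10, {1, 2, 3}),
    "RS1": (3,  {3, 4}),
    "RS2": (3,  {4, 5}),
    "BS2": (10, {4, 5, 6}),
    "RS3": (3,  {6}),
}
demand = {1, 2, 3, 4, 5, 6}

def cheapest_cover(sites, demand):
    """Brute-force the cheapest subset of sites covering all demand points."""
    best = (float("inf"), None)
    names = list(sites)
    subsets = chain.from_iterable(
        combinations(names, r) for r in range(1, len(names) + 1))
    for subset in subsets:
        covered = set().union(*(sites[s][1] for s in subset))
        if covered >= demand:
            best = min(best, (sum(sites[s][0] for s in subset), subset))
    return best

cost, chosen = cheapest_cover(sites, demand)
print(cost, chosen)
```

Branch and bound prunes this exponential search instead of enumerating it, and the clustering approach in the chapter splits the demand region so each subproblem stays small.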
  • Publication
    Enhancing the Utility of Anonymized Data by Improving the Quality of Generalization Hierarchies
    The dissemination of textual personal information has become an important driver of innovation. However, due to the possible content of sensitive information, this data must be anonymized. A commonly-used technique to anonymize data is generalization. Nevertheless, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used, as poorly-specified VGHs can decrease the usefulness of the resulting data. To tackle this problem, in our previous work we presented the Generalization Semantic Loss (GSL), a metric that captures the quality of categorical VGHs in terms of semantic consistency and taxonomic organization. We validated the accuracy of GSL using an intrinsic evaluation with respect to a gold standard ontology. In this paper, we extend our previous work by conducting an extrinsic evaluation of GSL with respect to the performance that VGHs have in anonymization (using data utility metrics). We show how GSL can be used to perform an a priori assessment of the VGHs' effectiveness for anonymization. In this manner, data publishers can quantitatively compare the quality of various VGHs and identify (before anonymization) those that better retain the semantics of the original data. Consequently, the utility of the anonymized datasets can be improved without sacrificing the privacy goal. Our results demonstrate the accuracy of GSL, as the quality of VGHs measured with GSL strongly correlates with the utility of the anonymized data. Results also show the benefits that an a priori VGH assessment strategy brings to the anonymization process in terms of time-savings and a reduction in the dependency on expert knowledge. Finally, GSL also proved to be lightweight in terms of computational resources.
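The role a VGH plays in generalization can be sketched with a toy hierarchy: each categorical value maps to its parent, and generalizing means walking up the tree. The attribute and values below are hypothetical, and GSL itself is not implemented here:

```python
# Hypothetical VGH for an "occupation" attribute: value -> parent category.
VGH = {
    "nurse": "healthcare", "surgeon": "healthcare",
    "teacher": "education", "professor": "education",
    "healthcare": "professional", "education": "professional",
    "professional": "*",   # root: full suppression
}

def generalize(value, levels):
    """Replace a value by its ancestor `levels` steps up the VGH."""
    for _ in range(levels):
        value = VGH.get(value, "*")
    return value

print(generalize("surgeon", 1))  # healthcare
print(generalize("surgeon", 3))  # *
```

A semantically inconsistent VGH (say, "surgeon" placed under "education") would still satisfy a privacy model, but the generalized data would mislead analysts; that is the kind of defect a quality metric over the hierarchy aims to catch before anonymization.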
  • Publication
    Bandwidth Allocation By Pricing In ATM Networks
    (Elsevier, 1994-03)
    Admission control and bandwidth allocation are important issues in telecommunications networks, especially when there are random fluctuating demands for service and variations in the service rates. In the emerging broadband communications environment these services are likely to be offered via an ATM network. In order to make ATM future-safe, methods for controlling the network should not be based on the characteristics of present services. We propose one bandwidth allocation method which has this property. Our proposed approach is based on pricing bandwidth to reflect network utilization, with users competing for resources according to their individual bandwidth valuations. The prices may be components of an actual tariff or they may be used as control signals, as in a private network. Simulation results show the improvement possible with our scheme versus a leaky bucket method in terms of cell loss probability, and confirm that a small queue with pricing can efficiently multiplex heterogeneous sources.
  • Publication
    The Role of Responsive Pricing in the Internet
    The Internet continues to evolve as it reaches out to a wider user population. The recent introduction of user-friendly navigation and retrieval tools for the World Wide Web has triggered an unprecedented level of interest in the Internet among the media and the general public, as well as in the technical community. It seems inevitable that some changes or additions are needed in the control mechanisms used to allocate usage of Internet resources. In this paper, we argue that a feedback signal in the form of a variable price for network service is a workable tool to aid network operators in controlling Internet traffic. We suggest that these prices should vary dynamically based on the current utilization of network resources. We show how this responsive pricing puts control of network service back where it belongs: with the users.
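The feedback idea can be sketched as a toy control loop: the price rises while aggregate demand exceeds capacity, and users with elastic demand back off as the price approaches their valuations. The valuations, demand model and gain below are assumptions for illustration, not the paper's scheme:

```python
capacity = 100.0
valuations = [9.0, 6.0, 3.0, 1.0]   # hypothetical per-unit willingness to pay

def demand(price):
    # Each user requests bandwidth proportional to valuation minus price.
    return 10.0 * sum(max(0.0, v - price) for v in valuations)

price = 0.0
for _ in range(200):
    # Proportional price adjustment: raise when overloaded, lower when idle.
    price = max(0.0, price + 0.01 * (demand(price) - capacity))

print(round(price, 2))   # settles where demand matches capacity
```

At the settled price the lowest-valuation user is priced out entirely, which is exactly the allocation-by-willingness-to-pay behavior the paper argues for.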
  • Publication
    Ontology-Based Quality Evaluation of Value Generalization Hierarchies for Data Anonymization
    In privacy-preserving data publishing, approaches using Value Generalization Hierarchies (VGHs) form an important class of anonymization algorithms. VGHs play a key role in the utility of published datasets as they dictate how the anonymization of the data occurs. For categorical attributes, it is imperative to preserve the semantics of the original data in order to achieve a higher utility. Despite this, semantics have not been formally considered in the specification of VGHs. Moreover, there are no methods that allow users to assess the quality of their VGHs. In this paper, we propose a measurement scheme, based on ontologies, to quantitatively evaluate the quality of VGHs, in terms of semantic consistency and taxonomic organization, with the aim of producing higher-quality anonymizations. We demonstrate, through a case study, how our evaluation scheme can be used to compare the quality of multiple VGHs and can help to identify faulty VGHs.
  • Publication
    Towards an Efficient Performance Testing Through Dynamic Workload Adaptation
    Performance testing is a critical task to ensure an acceptable user experience with software systems, especially when there are high numbers of concurrent users. Selecting an appropriate test workload is a challenging and time-consuming process that relies heavily on the testers' expertise. Not only are workloads application-dependent, but it is also usually unclear how large a workload must be to expose any performance issues that exist in an application. Previous research has proposed to dynamically adapt the test workloads in real time based on the application behavior. By reducing the need for the trial-and-error test cycles required when using static workloads, dynamic workload adaptation can reduce the effort and expertise needed to carry out performance testing. However, such approaches usually require testers to properly configure several parameters in order to be effective in identifying workload-dependent performance bugs, which may hinder their usability among practitioners. To address this issue, this paper examines the different criteria needed to conduct performance testing efficiently using dynamic workload adaptation. We present the results of comprehensively evaluating one such approach, providing insights into how to tune it properly in order to obtain better outcomes based on different scenarios. We also study the effects of varying its configuration and how this can affect the results obtained.
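The adaptation idea can be sketched as a simple search loop that grows the workload while the measured response time stays under a threshold and refines its step once the threshold is crossed. The latency model, threshold and step schedule below are hypothetical, not the evaluated tool's configuration:

```python
def measured_latency(users):
    # Stand-in for a real measurement: flat at 50 ms up to 120 users,
    # then degrading by 4 ms per additional concurrent user.
    return 50.0 if users <= 120 else 50.0 + 4.0 * (users - 120)

threshold_ms = 200.0
users, step = 10, 10
while True:
    if measured_latency(users) < threshold_ms:
        users += step           # ramp up while the SLA holds
    else:
        users -= step           # back off past the violation
        if step == 1:
            break               # finest step reached: users is maximal
        step //= 2              # refine the search step
print(users)
```

The parameters a tester must still choose (initial step, threshold, refinement schedule) are precisely the kind of configuration knobs whose tuning the paper investigates.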