  • Publication
    Towards an automated approach to use expert systems in the performance testing of distributed systems
    Performance testing in distributed environments is challenging. Specifically, the identification of performance issues and their root causes is a time-consuming and complex task which relies heavily on expertise. To simplify these tasks, many researchers have developed tools with built-in expertise. However, limitations exist in these tools, such as managing huge volumes of distributed data, which prevent their efficient usage for the performance testing of highly distributed environments. To address these limitations, this paper presents an adaptive framework to automate the usage of expert systems in performance testing. Our validation assessed the accuracy of the framework and the time savings that it brings to testers. The results proved the benefits of the framework, achieving a significant decrease in the time invested in performance analysis and testing.
      Scopus© Citations: 5
  • Publication
    Adaptive GC-aware load balancing strategy for high-assurance Java distributed systems
    High-assurance applications usually require fast response times and high throughput on a constant basis. To fulfil these stringent quality-of-service requirements, these applications are commonly deployed in clustered instances. However, how to effectively manage these clusters has become a new challenge. A common approach is to deploy a front-end load balancer to optimise the workload distribution among the clustered applications. Thus, researchers have been studying how to improve the effectiveness of a load balancer. Our previous work presented a novel load balancing strategy which improves the performance of a distributed Java system by avoiding the performance impacts of major garbage collection, which is a common cause of performance degradation in Java applications. However, as that strategy used a static configuration, it could only improve the performance of a system if the strategy was configured with domain expert knowledge. This paper extends our previous work by presenting an adaptive GC-aware load balancing strategy which self-configures according to the GC characteristics of the application. Our results have shown that this adaptive strategy can achieve higher throughput and lower response time, compared to round-robin load balancing, while also avoiding the burden of manual tuning.
      Scopus© Citations: 9
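The GC-aware idea above can be illustrated with a minimal sketch: a round-robin balancer that skips any node whose old-generation heap occupancy suggests an imminent major collection. The `Node` class, `GCAwareBalancer`, and the fixed 85% threshold are hypothetical stand-ins for the self-configured values the paper's strategy derives from the application's GC characteristics:

```python
from dataclasses import dataclass
from itertools import cycle

@dataclass
class Node:
    name: str
    heap_used_pct: float  # current old-generation heap occupancy (0-100)

class GCAwareBalancer:
    """Round-robin balancer that skips nodes likely to trigger a major GC."""

    def __init__(self, nodes, gc_threshold_pct=85.0):
        self.nodes = nodes
        self.threshold = gc_threshold_pct  # illustrative fixed threshold
        self._rr = cycle(nodes)

    def pick(self):
        # Try each node once in round-robin order, skipping GC-risky ones.
        for _ in range(len(self.nodes)):
            node = next(self._rr)
            if node.heap_used_pct < self.threshold:
                return node
        # All nodes are at risk: fall back to plain round-robin.
        return next(self._rr)

nodes = [Node("n1", 90.0), Node("n2", 40.0), Node("n3", 88.0)]
balancer = GCAwareBalancer(nodes)
print(balancer.pick().name)  # → n2, the only node below the threshold
```

In the paper's adaptive version, the threshold would be adjusted automatically from observed GC behaviour rather than hard-coded.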
  • Publication
    Towards an Efficient Performance Testing Through Dynamic Workload Adaptation
    Performance testing is a critical task to ensure an acceptable user experience with software systems, especially when there are high numbers of concurrent users. Selecting an appropriate test workload is a challenging and time-consuming process that relies heavily on the testers’ expertise. Not only are workloads application-dependent, but it is also usually unclear how large a workload must be to expose any performance issues that exist in an application. Previous research has proposed to dynamically adapt the test workloads in real-time based on the application behavior. By reducing the need for the trial-and-error test cycles required when using static workloads, dynamic workload adaptation can reduce the effort and expertise needed to carry out performance testing. However, such approaches usually require testers to properly configure several parameters in order to be effective in identifying workload-dependent performance bugs, which may hinder their usability among practitioners. To address this issue, this paper examines the different criteria needed to conduct performance testing efficiently using dynamic workload adaptation. We present the results of comprehensively evaluating one such approach, providing insights into how to tune it properly in order to obtain better outcomes in different scenarios. We also study how varying its configuration affects the results obtained.
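A minimal sketch of the dynamic-adaptation idea described above: grow the workload between measurement rounds until the observed response time breaches a target, instead of re-running fixed static workloads. The `measure_rt` callable, `step_factor` parameter, and the toy saturation model are illustrative assumptions, not the evaluated approach itself:

```python
def adapt_workload(measure_rt, target_rt_ms, start_users=10,
                   step_factor=2.0, max_users=10_000):
    """Ramp concurrent users until measured response time breaches the target.

    measure_rt: callable that drives a test round at N users, returns avg ms.
    step_factor plays the role of one of the tunable parameters the paper
    studies; a poor choice can overshoot or slow down the search.
    """
    users = start_users
    history = []
    while users <= max_users:
        rt = measure_rt(users)
        history.append((users, rt))
        if rt > target_rt_ms:              # performance issue exposed
            return users, history
        users = int(users * step_factor)   # grow workload for next round
    return None, history                   # no breach within the limits

# Toy system: response time degrades once a bottleneck saturates at 300 users.
def fake_system(users):
    return 50 + max(0, users - 300) * 2

breach_at, trace = adapt_workload(fake_system, target_rt_ms=200)
print(breach_at)  # → 640, the first workload level exceeding the target
```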
  • Publication
    Choosing Machine Learning Algorithms for Anomaly Detection in Smart Building IoT Scenarios
    Internet of Things (IoT) systems produce large amounts of raw data in the form of log files. This raw data must then be processed to extract useful information. Machine Learning (ML) has proved to be an efficient technique for such tasks, but there are many different ML algorithms available, each suited to different types of scenarios. In this work, we compare the performance of 22 state-of-the-art supervised ML classification algorithms on different IoT datasets, when applied to the problem of anomaly detection. Our results show that there is no dominant solution, and that for each scenario, several candidate techniques perform similarly. Based on our results and a characterization of our datasets, we propose a recommendation framework which guides practitioners towards the subset of the 22 ML algorithms which is likely to perform best on their data.
      Scopus© Citations: 6
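The recommendation framework mentioned above maps dataset characteristics to a candidate subset of algorithms. The toy rules below (thresholds, trait names, and algorithm families) are purely illustrative assumptions, not the paper's actual rules, but they show the shape of such a recommender:

```python
def recommend(n_samples, n_features, anomaly_ratio):
    """Rule-based sketch mapping dataset traits to ML algorithm families."""
    candidates = []
    if anomaly_ratio < 0.05:
        # Heavy class imbalance: tree ensembles with weighting tend to cope well.
        candidates += ["RandomForest", "GradientBoosting"]
    if n_samples < 5_000:
        # Small data: kernel and instance-based methods stay tractable.
        candidates += ["SVM-RBF", "kNN"]
    if n_features > 100:
        # High-dimensional data: linear models are a robust starting bias.
        candidates += ["LinearSVM", "LogisticRegression"]
    return candidates or ["RandomForest"]  # fallback default

print(recommend(n_samples=2_000, n_features=20, anomaly_ratio=0.02))
```

A real framework would derive such rules empirically, as the paper does, from benchmarking the 22 algorithms across characterized datasets.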
  • Publication
    TRINI: an adaptive load balancing strategy based on garbage collection for clustered Java systems
    Nowadays, clustered environments are commonly used in high-performance computing and enterprise-level applications to achieve faster response time and higher throughput than single machine environments. Nevertheless, how to effectively manage the workloads in these clusters has become a new challenge. As a load balancer is typically used to distribute the workload among the cluster's nodes, multiple research efforts have concentrated on enhancing the capabilities of load balancers. Our previous work presented a novel adaptive load balancing strategy (TRINI) that improves the performance of a clustered Java system by avoiding the performance impacts of major garbage collection, which is an important cause of performance degradation in Java. The aim of this paper is to strengthen the validation of TRINI by extending its experimental evaluation in terms of generality, scalability and reliability. Our results have shown that TRINI can achieve significant performance improvements, as well as a consistent behaviour, when it is applied to a set of commonly used load balancing algorithms, demonstrating its generality. TRINI also proved to be scalable across different cluster sizes, as its performance improvements did not noticeably degrade when increasing the cluster size. Finally, TRINI exhibited reliable behaviour over extended time periods, introducing only a small overhead to the cluster in such conditions. These results offer practitioners a valuable reference regarding the benefits that a load balancing strategy, based on garbage collection, can bring to a clustered Java system.
      Scopus© Citations: 11
  • Publication
    A Requirements-based Approach for the Evaluation of Emulated IoT Systems
    The Internet of Things (IoT) has become a major technological revolution. Evaluating any IoT advancements comprehensively is critical to understand the conditions under which they can be most useful, as well as to assess the robustness and efficiency of IoT systems to validate them before their deployment in real life. Nevertheless, the creation of an appropriate IoT test environment is a difficult, effort-intensive, and expensive task, typically requiring a significant amount of human effort and physical hardware. To tackle this problem, emulation tools to test IoT devices have been proposed. However, there is a lack of systematic approaches for evaluating IoT emulation environments. In this paper, we present a requirements-based framework to enable the systematic evaluation of the suitability of an emulated IoT environment, that is, whether it fulfils the requirements that ensure the quality of an adequate IoT test environment.
      Scopus© Citations: 1
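A requirements-based evaluation of the kind described above can be sketched as a weighted scorecard. The requirement names, weights, and the 0.5 "unmet" cut-off below are hypothetical placeholders, not the paper's requirement catalogue:

```python
def evaluate_environment(scores, weights):
    """Score an emulated IoT environment against weighted requirements.

    scores:  requirement -> degree of fulfilment in [0, 1]
    weights: requirement -> relative importance (illustrative values)
    Returns an overall score and the list of insufficiently met requirements.
    """
    total_w = sum(weights.values())
    overall = sum(weights[r] * scores.get(r, 0.0) for r in weights) / total_w
    unmet = [r for r in weights if scores.get(r, 0.0) < 0.5]
    return round(overall, 2), unmet

weights = {"scalability": 3, "fidelity": 3, "repeatability": 2, "cost": 1}
scores = {"scalability": 0.9, "fidelity": 0.4, "repeatability": 1.0, "cost": 0.8}
print(evaluate_environment(scores, weights))  # → (0.74, ['fidelity'])
```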
  • Publication
    Towards an Efficient Log Data Protection in Software Systems through Data Minimization and Anonymization
    IT infrastructures of companies generate large amounts of log data every day. These logs are typically analyzed by software engineers to gain insights about activities occurring within a company (e.g., to debug issues exhibited by the production systems). To facilitate this process, log data management is often outsourced to cloud providers. However, logs may contain information that is sensitive by nature and considered personally identifiable under most of the new privacy protection laws, such as the European General Data Protection Regulation (GDPR). To ensure that companies do not violate regulatory compliance, they must adopt appropriate data protection measures in their software systems. Such privacy protection laws also promote the use of anonymization techniques as possible mechanisms to operationalize data protection. However, companies struggle to put anonymization into practice due to the lack of integrated, intuitive, and easy-to-use tools that work effectively with their log management systems. In this paper, we propose an automatic approach (SafeLog) to filter out information and anonymize log streams, in order to safeguard the confidentiality of sensitive data and prevent its exposure and misuse by third parties. Our results show that atomic anonymization operations can be effectively applied to log streams to preserve the confidentiality of information, while still supporting different types of analysis tasks, such as user-behavior analysis and anomaly detection. Our approach also reduces the amount of data sent to cloud vendors, hence decreasing the financial costs and the risk of overexposing information.
      Scopus© Citations: 2
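The minimize-then-anonymize idea described above can be sketched with two atomic operations: dropping low-value records and replacing identifiers with deterministic pseudonyms, so the same user always maps to the same token and behaviour analysis remains possible. This is an illustrative sketch, not the SafeLog implementation; the regexes and the DEBUG-dropping rule are assumptions:

```python
import hashlib
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def pseudonym(match):
    # Deterministic pseudonym: the same identifier always yields the same
    # token, so user-behaviour and anomaly analyses stay feasible.
    digest = hashlib.sha256(match.group(0).encode()).hexdigest()[:10]
    return f"anon:{digest}"

def anonymize_line(line, drop_debug=True):
    """Minimize, then anonymize, one log line before shipping it to the cloud."""
    if drop_debug and " DEBUG " in line:
        return None  # data minimization: filter out low-value records entirely
    line = EMAIL.sub(pseudonym, line)
    return IPV4.sub(pseudonym, line)

print(anonymize_line("2021-03-01 INFO login user=alice@example.com from 10.0.0.7"))
```

Because the mapping is a one-way hash, the pseudonyms are stable across the stream yet not reversible by the cloud vendor.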
  • Publication
    In-Test Adaptation of Workload in Enterprise Application Performance Testing
    Performance testing is used to assess if an enterprise application can fulfil its expected Service Level Agreements. However, since some performance issues depend on the input workloads, it is common to use time-consuming and complex iterative test methods, which heavily rely on human expertise. This paper presents an automated approach to dynamically adapt the workload so that issues (e.g. bottlenecks) can be identified more quickly as well as with less effort and expertise. We present promising results from an initial validation prototype indicating an 18-fold decrease in the test time without compromising the accuracy of the test results, while only introducing a marginal overhead in the system.
      Scopus© Citations: 5
  • Publication
    COCOA: A Synthetic Data Generator for Testing Anonymization Techniques
    Conducting extensive testing of anonymization techniques is critical to assess their robustness and identify the scenarios where they are most suitable. However, access to real microdata is highly restricted, and the data that is publicly available is usually anonymized or aggregated, hence reducing its value for testing purposes. In this paper, we present a framework (COCOA) for the generation of realistic synthetic microdata that allows multi-attribute relationships to be defined in order to preserve the functional dependencies of the data. We show how COCOA strengthens the testing of anonymization techniques by broadening the number and diversity of the test scenarios. Results also show that COCOA is practical for generating large datasets.
      Scopus© Citations: 8
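The multi-attribute-relationship idea above can be sketched by generating records in which one attribute constrains the valid values of another (a functional dependency). The region/city dependency and the attribute set are illustrative assumptions in the spirit of COCOA, not its actual generator:

```python
import random

# Functional dependency: region determines the set of valid cities (illustrative).
CITIES = {"north": ["Oslo", "Bergen"], "south": ["Rome", "Naples"]}

def generate_records(n, seed=42):
    """Generate synthetic microdata preserving a region -> city dependency."""
    rng = random.Random(seed)  # seeded for reproducible test datasets
    records = []
    for _ in range(n):
        region = rng.choice(list(CITIES))
        records.append({
            "region": region,
            "city": rng.choice(CITIES[region]),  # city consistent with region
            "age": rng.randint(18, 90),
        })
    return records

data = generate_records(1000)
# Every generated record respects the functional dependency.
assert all(r["city"] in CITIES[r["region"]] for r in data)
```

Preserving such dependencies is what keeps the synthetic data realistic enough to exercise anonymization techniques meaningfully.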
  • Publication
    Improving the Testing of Clustered Systems Through the Effective Usage of Java Benchmarks
    Nowadays, cluster computing has become a cost-effective and powerful solution for enterprise-level applications. Nevertheless, the usage of this architecture model also increases the complexity of the applications, complicating all activities related to performance optimisation. Thus, many research efforts have pursued advancements for improving the performance of clusters. Comprehensively evaluating such advancements is key to understanding the conditions under which they can be most useful. However, the creation of an appropriate test environment, that is, one which offers different application behaviours (so that the obtained conclusions can be better generalised), is typically an effort-intensive task. To help tackle this problem, this paper presents a tool that decreases the effort and expertise needed to build useful test environments for more robust cluster testing. This is achieved by enabling the effective usage of Java benchmarks to easily create clustered test environments, hence diversifying the application behaviours that can be evaluated. We also present the results of a practical validation of the proposed tool, in which it was successfully applied to the evaluation of two cluster-related advancements.