Computer Science Theses
Permanent URI for this collection
This collection comprises doctoral and master's theses by research, received in accordance with university regulations.
For more information, please visit the UCD Library Theses Information guide.
Browsing Computer Science Theses by Issue Date
Now showing 1 - 20 of 49
- Publication: Enabling the remote acquisition of digital forensic evidence through secure data transmission and verification (University College Dublin. School of Computer Science, 2009)
Providing any law enforcement officer with the ability to transfer an image from a suspect computer directly to a forensic laboratory for analysis can greatly reduce the time forensic investigators spend on on-site collection of computer equipment. RAFT (Remote Acquisition Forensic Tool) is a system designed to assist forensic investigators by remotely gathering digital evidence. This is achieved through the implementation of a secure, verifiable client/server imaging architecture. The RAFT system is designed to be relatively easy to use, requiring minimal technical knowledge on the part of the user. A key focus of RAFT is to ensure that the evidence it gathers remotely is court admissible. This is achieved by verifying that the image taken using RAFT is identical to the original evidence on the suspect computer.
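The verification step the abstract describes, confirming that an acquired image is bit-identical to the original, is conventionally done by comparing cryptographic digests. A minimal sketch (the function names are illustrative, not RAFT's actual API):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large disk images never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_acquisition(source_image: str, received_image: str) -> bool:
    """The received image is admissible only if it is bit-identical to the source."""
    return sha256_of(source_image) == sha256_of(received_image)
```

Any single flipped bit in transit changes the digest, so a match is strong evidence the remote copy equals the on-site original.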
- Publication: Internalising interaction protocols as first-class programming elements in multi agent systems (University College Dublin. School of Computer Science and Informatics, 2012)
Since their inception, Multi Agent Systems (MASs) have been championed as a solution to the growing problem of software complexity. Communities of distributed autonomous computing entities that are capable of collaborating, negotiating and acting to solve complex organisational and system management problems are an attractive proposition. Central to this is the requirement for agents to possess the capability of interacting with one another in a structured, consistent and organised manner. This thesis presents the Agent Conversation Reasoning Engine (ACRE), which constitutes a holistic view of communication management for MASs. ACRE is intended to facilitate the practical development, debugging and deployment of communication-heavy MASs. ACRE has been formally defined in terms of its operational semantics, and a generic architecture has been proposed to facilitate its integration with a wide variety of diverse agent development frameworks and Agent Oriented Programming (AOP) languages. A concrete implementation has also been developed that uses the Agent Factory AOP framework as its base. This allows ACRE to be used with a number of different AOP languages, while providing a reference implementation that other integrations can be modelled upon. A standard is also proposed for the modelling and sharing of agent-focused interaction protocols that is independent of the platform within which a concrete ACRE implementation is run. Finally, a user evaluation illustrates the benefits of incorporating conversation management into agent programming.
- Publication: Study of Peer-to-Peer Network Based Cybercrime Investigation: Application on Botnet Technologies (University College Dublin. School of Computer Science & Informatics, 2013)
The scalable, low-overhead attributes of Peer-to-Peer (P2P) Internet protocols and networks lend themselves well to being exploited by criminals to execute a large range of cybercrimes. The types of crime aided by P2P technology include copyright infringement, sharing of illicit images of children, fraud, hacking/cracking, denial of service attacks and virus/malware propagation through the use of a variety of worms, botnets, malware, viruses and P2P file sharing. This project focuses on the study of active P2P nodes, along with the analysis of the undocumented communication methods employed in many of these large unstructured networks. This is achieved through the design and implementation of an efficient P2P monitoring and crawling toolset. The requirement for investigating P2P based systems is not limited to the more obvious cybercrimes listed above, as many legitimate P2P based applications may also be pertinent to a digital forensic investigation, e.g., voice over IP, instant messaging, etc. Investigating these networks has become increasingly difficult due to the broad range of network topologies and the ever increasing and evolving range of P2P based applications. In this work we introduce the Universal P2P Network Investigation Framework (UP2PNIF), a framework which enables significantly faster and less labour intensive investigation of newly discovered P2P networks through the exploitation of the commonalities in P2P network functionality. In combination with a reference database of known network characteristics, it is envisioned that any known P2P network can be instantly investigated using the framework, which can intelligently determine the best investigation methodology and greatly expedite the evidence gathering process.
A proof of concept tool was developed for conducting investigations on the BitTorrent network. A number of investigations conducted using this tool are outlined in Chapter 6.
- Publication: Applying natural language processing to clinical information retrieval (University College Dublin. School of Computer Science and Informatics, 2014)
Medical literature, such as medical health records, is increasingly digitised. As with any large growth of digital data, methods must be developed to manage data as well as to extract any important information. Information Retrieval (IR) techniques, for instance search engines, provide an intuitive medium for locating important information among large volumes of data. With more and more patient records being digitised, the use of search engines in a healthcare setting provides a highly promising method for efficiently overcoming the problem of information overload. Traditional IR approaches often perform retrieval based solely on term frequency counts, known as a 'bag-of-words' approach. While these approaches are effective in certain settings, they fail to account for more complex semantic relationships that are often prevalent in medical literature, such as negation (e.g. 'absence of palpitations'), temporality (e.g. 'previous admission for fracture'), attribution (e.g. 'Father is diabetic'), or term dependencies ('colon cancer'). Furthermore, the high level of linguistic variation and synonymy found in clinical reports gives rise to issues of vocabulary mismatch, whereby concepts in a document and query may be the same, yet relevant documents are missed because of differences in their textual representation, e.g. hypertension and HTN. Given the high cost associated with errors in the medical domain, precise retrieval and reduction of errors is imperative. Given the growing number of shared tasks in the domain of Clinical Natural Language Processing (NLP), this thesis investigates how best to integrate Clinical NLP technologies into a Clinical Information Retrieval workflow in order to enhance the search engine experience of healthcare professionals.
To determine this, we apply three current directions in Clinical NLP research to the retrieval task. First, we integrate a Medical Entity Recognition system, developed and evaluated on I2B2 datasets, achieving an f-score of 0.85. The second technique clarifies the Assertion Status of medical conditions by determining who the actual experiencer of the medical condition in the report is, together with its negation and temporality. In standalone evaluations on I2B2 datasets, the system achieved a micro f-score of 0.91. The final NLP technique applied is Concept Normalisation, whereby textual concepts are mapped to concepts in an ontology in order to avoid problems of vocabulary mismatch. While evaluation scores on the CLEF evaluation corpus are 0.509, this concept normalisation approach is shown in the thesis to be the most effective of the three NLP approaches explored in aiding Clinical IR performance.
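The two failure modes the abstract highlights, negation blindness in bag-of-words retrieval and vocabulary mismatch, can be illustrated with a toy sketch (this is not the thesis system; the concept identifier shown is a UMLS-style code used for illustration):

```python
# A naive bag-of-words matcher: it matches on the term alone, so a negated
# finding ("absence of palpitations") still counts as a hit.
def bag_of_words_match(query: str, document: str) -> bool:
    return any(term in document.lower().split() for term in query.lower().split())

report = "patient reports absence of palpitations"
assert bag_of_words_match("palpitations", report)  # false positive: the finding is negated

# Concept normalisation: map surface forms to a shared concept identifier so
# that "HTN" and "hypertension" retrieve the same documents.
CONCEPT_MAP = {"hypertension": "C0020538", "htn": "C0020538"}  # illustrative mapping

def normalise(term: str) -> str:
    return CONCEPT_MAP.get(term.lower(), term.lower())

assert normalise("HTN") == normalise("hypertension")
```

A real pipeline would run assertion-status classification to discard the negated hit and use a full ontology rather than a hand-written map, but the sketch shows why term counts alone are insufficient in clinical text.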
- Publication: A model of collaboration-based reputation for social recommender systems (University College Dublin. School of Computer Science and Informatics, 2014)
Today's online world is one full of rich interactions between its users. In the early days of the web, activity was almost exclusively solitary; now, however, users regularly collaborate with one another, often mediated by a piece of content or service. In offline communities, continued good behaviour and long-term relationship building lead naturally to good reputation. Online, however, users often remain anonymous to their community, so trust can be difficult to foster among community members. As such, a need has arisen for the system to calculate user reputation itself. Online reputation systems provide a variety of benefits to the platforms that employ them, such as an incentive mechanism for good behaviour and improved robustness of the platform. However, these systems are often based on ad hoc activity metrics, and thus do not generalise to multiple platforms or different tasks. In this thesis we introduce a novel approach to capturing and harnessing online reputation. Our approach is to develop a computational model of reputation that is based on the various types of collaboration events that naturally occur in many different types of online social platforms. We describe how a graph-based representation of these collaboration events can be used to aggregate reputation at the user level, and we evaluate a variety of different aggregation strategies. Further, we show how the availability of this type of user reputation can be used to influence traditional recommender systems by combining relevance and reputation at recommendation time. A major part of our evaluation involves integration with the HeyStaks social search system, testing our approach on real-user data from the service.
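The core idea, collaboration events as weighted edges whose weights are aggregated into a per-user reputation score, can be sketched minimally (the event types, weights, and aggregation below are illustrative assumptions, not the thesis's actual model or HeyStaks data):

```python
from collections import defaultdict

# Each collaboration event: (producer, consumer, weight), e.g. one user's
# shared result being selected by another.
collaborations = [
    ("alice", "bob", 1.0),
    ("alice", "carol", 0.5),
    ("bob", "carol", 1.0),
]

def reputation(events, aggregate=sum):
    """Aggregate the weights of each producer's collaboration events.
    Swapping `aggregate` (sum, mean, max, ...) changes the strategy,
    mirroring the thesis's comparison of aggregation strategies."""
    scores = defaultdict(list)
    for producer, _consumer, weight in events:
        scores[producer].append(weight)
    return {user: aggregate(ws) for user, ws in scores.items()}

rep = reputation(collaborations)  # {'alice': 1.5, 'bob': 1.0}
```

A reputation-aware recommender could then combine such a score with a relevance score at recommendation time, e.g. as a weighted product or sum.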
- Publication: Recommending user connections by utilising the real-time Web (University College Dublin. School of Computer Science and Informatics, 2014)
Social media services, such as Facebook and Twitter, thrive on user engagement around the active sharing and passive consumption of content. Many of these services have become an important way to discover relevant and interesting information in a timely manner. But to make the most of this aspect of these services, it is important that users can locate and follow the most useful producers of relevant content. As these services have continued to grow rapidly, this has become more and more of a challenge, especially for new users. This problem can be solved in principle by constructing a recommendation system, based on a model of users' preferences and interests, to recommend new users worth following. In this thesis we propose a recommendation framework for friend finding. It is capable of integrating the different sources of user preference information that are available through services such as Twitter and related services. It is also designed to provide a natural partitioning of user interests, distinguishing topics that are core to the user from those that are more peripheral, along with the social connections linked with the user. This provides access to a range of different types of recommendation strategies that may be more helpful in focusing the search for relevant users according to different types of user interests. We demonstrate the effectiveness of our approach by evaluating recommendation quality across large sets of real-world users.
- Publication: Real-time monitoring and validation of waste transportation using intelligent agents and pattern recognition (University College Dublin. School of Computer Science and Informatics, 2015)
Within Ireland and other Organisation for Economic Co-operation and Development countries, there has been a growing problem of unauthorised waste activity. A report on this activity highlighted a number of problems. Of these, unauthorised collection and fly-tipping of waste are of particular concern due to their potential to cause pollution and health problems. This thesis presents the Waste Augmentation and Integrated Shipment Tracking (WAIST) system. WAIST utilises technologies from the areas of pattern recognition, agent-oriented programming and wireless sensor networks to enable the monitoring and validation of waste transportation in near real-time. As components of the WAIST system, this thesis also introduces and evaluates two technologies. The first is the classification of object state based on accelerometer data, and the second is the use of agent-oriented programming languages as a high-level abstraction for reducing "programmer effort" when implementing intelligent behaviours within WAIST. Both evaluations show positive results. In the classification component, an accuracy of 95.8% was achieved on an eight-class problem. In the agent component, students completed more tasks when using agents than when using Java. Additionally, subjective feedback highlighted a perception that problems were easier to solve using agents. Finally, the WAIST system itself was evaluated over a number of simulated waste shipments against a number of criteria. The results are very positive for the timeliness of the system, the ability to track stopping locations of the shipment, the accuracy when identifying illegal dumping, and the efficient management of energy resources.
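Classifying object state from accelerometer data typically means extracting simple features from a window of readings and comparing them to per-class profiles. A hedged sketch of one such scheme, a nearest-centroid rule over mean and standard deviation (the features, classes, and centroid values are illustrative assumptions; the thesis's eight-class classifier may differ entirely):

```python
import statistics

def features(window):
    """Summarise one window of accelerometer magnitudes."""
    return (statistics.mean(window), statistics.pstdev(window))

# Illustrative per-class feature centroids, as if learned from training data.
CENTROIDS = {
    "stationary": (0.0, 0.05),
    "in_transit": (0.3, 0.9),
}

def classify(window):
    """Assign the window to the class with the nearest feature centroid."""
    f = features(window)
    return min(CENTROIDS,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(f, CENTROIDS[c])))

assert classify([0.01, -0.02, 0.0, 0.01]) == "stationary"
```

A shipment that reports "stationary" at an unscheduled location is exactly the kind of signal a monitoring system like WAIST could flag as potential illegal dumping.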
- Publication: Evaluation models for different routing protocols in wireless sensor networks (University College Dublin. School of Computer Science, 2015)
This thesis introduces the evaluation parameters of Lifetime, Density, Radius, and Reliability for applications of wireless sensor networks. A series of simulation results has been obtained for the Single-hop, LEACH and Nearest Closer routing protocols, which were implemented in the J-Sim simulation platform. The simulation results have been analysed and several evaluation models have been proposed, so that users may be able to choose a suitable routing protocol without running simulations themselves.
- Publication: Forensic readiness capability for cloud computing (University College Dublin. School of Computer Science, 2015)
Cloud computing services now deliver computation to most customer communities. Such services are regulated by a contract called a Service Level Agreement (SLA), cosigned by customers and providers. During its validity period, several contractual constraints have to be respected by the involved parties. Due to their popularity, cloud services are heavily used and, unfortunately, also abused, especially by cyber-criminals. One manner of guaranteeing and enhancing cloud service security is to provision a forensic readiness capability for them. Such a capability is responsible for performing activities that prepare the services for a possible forensic investigation. Sometimes, crimes are related to contractual constraint violations that the parties are unaware of. Thus, a dedicated forensic readiness capability that interacts with cloud services and detects SLA violations by analysing cloud log files can guarantee more control over such contracts. In this dissertation, a formal model representing a forensic readiness capability for the cloud that detects contractual violations is presented, together with a prototype system running on a specific case study.
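The detection step, scanning cloud log records against a cosigned SLA constraint and flagging violations for possible investigation, reduces to a filter in its simplest form. A minimal sketch (the record fields and the response-time constraint are illustrative assumptions, not the dissertation's formal model):

```python
# One illustrative SLA constraint: no operation may exceed 200 ms.
sla = {"max_response_ms": 200}

log_records = [
    {"ts": "2015-01-01T10:00:00", "op": "read",  "response_ms": 120},
    {"ts": "2015-01-01T10:00:01", "op": "write", "response_ms": 350},
]

def detect_violations(records, sla):
    """Return every log record that breaches the SLA response-time bound."""
    return [r for r in records if r["response_ms"] > sla["max_response_ms"]]

violations = detect_violations(log_records, sla)  # flags the 350 ms write
```

A readiness capability would additionally preserve the flagged records with integrity guarantees (e.g. hashes and timestamps) so they remain usable as evidence.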
- Publication: The SIXTH Middleware: sensible sensing for the sensor web (University College Dublin. School of Computer Science and Informatics, 2015)
Governments, multinationals, researchers, and enthusiasts are presently weaving the planet's "electronic skin" (Gross, 1999) via miniature, wireless, low-power sensor technologies. However, the control and interconnection of these diverse heterogeneous devices remains difficult, tedious, and time consuming. This thesis proposes and develops a novel sensor-domain middleware, permissive of any data source, which espouses flexibility, domain modelling, design patterns, extensibility, and simplicity. The thesis provides an extensive review of the state of the art in middleware for sensor technologies. In doing so, a set of shortcomings is identified which forms the basis of a desiderata for future sensor network middleware. In line with these aspirations, the SIXTH middleware is designed, implemented, and evaluated thoroughly. The design of SIXTH is true to the domain, directly mapping virtual representations to real-world artifacts. The design incorporates the abstractions prevalent in low-level domain middleware, such as logical grouping, aggregates, and queries. SIXTH advances the state of the art by providing improvements over the form and function of its near neighbours. A concrete implementation has been delivered using OSGi as its basis. This implementation is evaluated through its usage in published case studies, a survey of the developers utilising the framework, and objective code metrics.
- Publication: The role of unsatisfiable Boolean constraints in lightweight description logics (University College Dublin. School of Computer Science, 2016)
Lightweight Description Logics (e.g. EL, EL+, etc.) are commonly used languages for representing life science ontologies. In such languages, ontology classification (the problem of computing all the subsumption relations) is tractable and is used to characterise all the classes and properties in any given ontology. Despite the fact that classification is tractable in EL+, axiom pinpointing (the problem of computing the reasons for an unintended subsumption relation) is still worst-case exponential. This thesis proposes state-of-the-art SAT-based axiom pinpointing methods for the Lightweight Description Logic EL+. These axiom pinpointing methods emanate from the analysis of minimal unsatisfiable Boolean constraints using hitting set dualisation, which is also related to Reiter's model-based diagnosis. The consequences are significant both in terms of algorithms (i.e., MUS extraction and enumeration methods) and in understanding axiom pinpointing in Lightweight Description Logics through the prism of minimal unsatisfiable Boolean constraints and related problems (i.e., hypergraph transversal).
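The hitting set dualisation the abstract invokes says that the minimal correction sets (MCSes) of an unsatisfiable constraint set are exactly the minimal hitting sets of its MUSes, and vice versa. A brute-force toy illustration for tiny inputs (the axiom names are invented; real tools enumerate these sets with SAT solvers, not enumeration over subsets):

```python
from itertools import chain, combinations

def minimal_hitting_sets(families):
    """All minimal sets intersecting every family. Exponential; toy sizes only."""
    universe = sorted(set(chain.from_iterable(families)))
    hits = [set(c) for r in range(1, len(universe) + 1)
            for c in combinations(universe, r)
            if all(set(c) & f for f in families)]
    # Keep only hitting sets with no strictly smaller hitting set inside them.
    return [h for h in hits if not any(o < h for o in hits)]

# Two MUSes explaining one unintended EL+ subsumption (illustrative axioms).
muses = [{"ax1", "ax2"}, {"ax2", "ax3"}]
mcses = minimal_hitting_sets(muses)  # [{'ax2'}, {'ax1', 'ax3'}]
```

Removing any one MCS from the ontology, here either {ax2} alone or {ax1, ax3} together, breaks every MUS and thus repairs the unintended subsumption, which is why the duality matters for pinpointing.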
- Publication: Performance optimisation of clustered Java systems (University College Dublin. School of Computer Science, 2016)
Nowadays, clustered environments are commonly used in enterprise-level applications to achieve faster response times and higher throughput than single-machine environments. However, this shift from a monolithic architecture to a distributed one has augmented the complexity of these applications, considerably complicating all activities related to the performance optimisation of such clustered systems. Therefore, automatic techniques are needed to facilitate these performance-related activities, which otherwise would be highly error-prone and time-consuming. This thesis contributes to the area of performance optimisation of clustered systems in Java (a predominant technology at the enterprise level), especially aiming at large-scale environments. The thesis proposes two techniques to solve the problems of efficiently identifying workload-dependent performance issues and efficiently avoiding the performance impacts of major garbage collection, two problems from which a typical clustered Java system would likely suffer in large-scale environments. In particular, the thesis introduces an adaptive framework to automate the usage of performance diagnosis tools in the performance testing of clustered systems. The aim is to ease the identification of performance issues by decreasing the effort and expertise needed to effectively use such tools. Additionally, an adaptive GC-aware load balancing strategy is introduced, which leverages major garbage collection forecasts to decide on the best way to balance the workload across the available nodes. The aim is to improve the performance of a clustered system by avoiding the impact on the cluster's performance of the major garbage collections occurring at the individual nodes. Experimental results of applying these techniques to a set of real-life applications are presented, showing the benefits that the techniques bring to a clustered Java system.
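The GC-aware balancing rule can be sketched in a few lines: route each request to the least-loaded node that is not forecast to run a major collection within the request's expected service time. This is a hedged sketch of the core idea only; the field names and fallback policy are illustrative assumptions, not the thesis's actual strategy:

```python
def pick_node(nodes, now, expected_ms):
    """Prefer nodes whose forecast major GC lies beyond the request's duration."""
    candidates = [n for n in nodes if n["next_major_gc_ms"] - now > expected_ms]
    pool = candidates or nodes  # fall back if every node is near a collection
    return min(pool, key=lambda n: n["load"])

nodes = [
    {"name": "a", "load": 0.2, "next_major_gc_ms": 1_050},  # about to collect
    {"name": "b", "load": 0.6, "next_major_gc_ms": 9_000},
]
# Despite node "a" being less loaded, its imminent major GC would stall the
# request, so the balancer chooses "b".
assert pick_node(nodes, now=1_000, expected_ms=100)["name"] == "b"
```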
- Publication: Adapting child-robot interaction to reflect age and gender (University College Dublin. School of Computer Science, 2016)
Research and commercial robots have infiltrated homes, hospitals and schools, becoming attractive and proving impactful for children's healthcare, therapy, edutainment, and other applications. The focus of this thesis is to investigate the little-explored issue of how children's perception of a robot changes with age, and thus to create a robot that adapts to these differences. In particular, this research investigates the impact of gender segregation on children's interactions with a humanoid NAO robot. To this end, a series of experiments was conducted with children aged between 5 and 12 years old. The results suggest that children aged between 9 and 12 years old do not support the gender segregation hypothesis with a gendered robot. In order to dynamically adapt to children's age and gender, a perception module was developed using depth data and a collected depth dataset of 3D body metrics of 428 children aged between 5 and 16 years old. This module is able to successfully determine children's gender in real-world settings with 60.89% (76.64% offline) accuracy and estimate children's age with a mean absolute error of only 1.83 (0.77 offline) years. Additionally, a pretend-play testbed was designed in order to address the challenges of evaluating child-robot interaction by exploiting the advantages of multi-modal, multi-sensory perception. The pretend-play testbed performed successfully at a children's play centre, where a humanoid NAO robot was able to dynamically adapt its gender by changing its synthesized voice to match the child's perceived age and gender. Analysing the free play of the children, the results confirm the hypothesis of gender segregation for children aged younger than 8 years old. These findings are important to consider when designing robotic applications for children in order to improve engagement, which is essential for a robot's educational and therapeutic benefits.
- Publication: From Detection to Discourse: Tracking Events and Communities in Breaking News (University College Dublin. School of Computer Science, 2016-12)
Online social networks are now an established part of our reality. People no longer rely solely on traditional media outlets to stay informed. Collectively, acts of citizen journalism have transformed news consumers into producers. Keeping up with the overwhelming volume of user-generated content from social media sources is challenging for even well-resourced news organisations. Filtering the most relevant content, however, is not trivial. Significant demand exists for editorial support systems that enable journalists to work more effectively. Social newsgathering introduces many new challenges to the tasks of detecting and tracking breaking news stories. In detection, substantial volumes of data introduce scalability challenges. When tracking developing stories, approaches developed on static collections of documents often fail to capture important changes in the content or structure of data over time. Furthermore, systems tuned on static collections can perform poorly on new, unseen data. To understand significant events, we must also consider the people and organisations who are generating content related to these events. Newsworthy sources are rarely objective and neutral, and in some cases are purposefully created for disinformation, giving rise to the "fake news" phenomenon. An individual's political ideology will inform and influence their choice of language, especially during significant political events such as elections, protests, and other polarising incidents. This thesis presents techniques developed with the intention of supporting journalists who monitor social media for breaking news.
These techniques span the curation of newsworthy sources, the implementation of an alert system for breaking news events, the tracking of how these stories evolve over time, and finally an exploration of the language used by different communities to gain insights into the discourse around an event. As well as detecting and tracking significant events, it is of interest to identify the differences in language patterns between the groups of people around those events. Distributional semantic language models offer a way to quantify certain aspects of discourse, allowing us to track how different communities use language, thereby revealing their stances on key issues.
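The distributional intuition, that communities which use a word in different contexts reveal different stances, can be illustrated with a toy sketch: represent a target word by the counts of its context words in each community's corpus, then compare the two usages by cosine similarity. Real systems train embedding models per community; the two tiny corpora below are invented for illustration:

```python
from collections import Counter
from math import sqrt

def context_vector(corpus, target, window=2):
    """Count the words appearing within `window` tokens of `target`."""
    vec = Counter()
    for sent in corpus:
        toks = sent.split()
        for i, tok in enumerate(toks):
            if tok == target:
                vec.update(toks[max(0, i - window):i] + toks[i + 1:i + 1 + window])
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)  # Counter returns 0 for missing keys
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

community_a = ["the protest was peaceful", "peaceful protest downtown"]
community_b = ["the protest turned violent", "violent protest downtown"]
sim = cosine(context_vector(community_a, "protest"),
             context_vector(community_b, "protest"))
```

A similarity well below 1.0 indicates the two communities frame "protest" with different surrounding vocabulary, which is the kind of signal used to contrast community discourse.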
- Publication: Enhancing the utility of anonymized data in privacy-preserving data publishing (University College Dublin. School of Computer Science, 2017)
The collection, publication, and mining of personal data have become key drivers of innovation and value creation. In this context, it is vital that organizations comply with the pertinent data protection laws to safeguard the privacy of individuals and prevent the uncontrolled disclosure of their information (especially of sensitive data). However, data anonymization is a time-consuming, error-prone, and complex process that requires a high level of expertise in data privacy and domain knowledge; otherwise, the quality of the anonymized data and the robustness of its privacy protection are compromised. This thesis contributes to the area of Privacy-Preserving Data Publishing by proposing a set of techniques that help users to make informed decisions on publishing safe and useful anonymized data, while reducing the expert knowledge and effort required to apply anonymization. In particular, the main contributions of this thesis are: (1) A novel method to evaluate, in an objective, quantifiable, and automatic way, the semantic quality of Value Generalization Hierarchies (VGHs) for categorical data. By improving the specification of the VGHs, the quality of the anonymized data is also improved. (2) A framework for the automatic construction and multi-dimensional evaluation of VGHs. The aim is to generate VGHs more efficiently and of better quality than when done manually. Moreover, the evaluation of VGHs is enhanced, as users can compare VGHs from various perspectives and select the ones that better fit their preferences to drive the anonymization of data. (3) A practical approach for the generation of realistic synthetic datasets which preserves the functional dependencies of the data. The aim is to strengthen the testing of anonymization techniques by broadening the number and diversity of the test scenarios.
(4) A conceptual framework that describes a set of relevant elements that underlie the assessment and selection of anonymization algorithms. Also, a systematic comparison and analysis of a set of anonymization algorithms to identify the factors that influence their performance, in order to guide users in the selection of a suitable algorithm.
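The role a VGH plays in anonymization can be sketched with a toy hierarchy for one categorical attribute: each original value has a chain of increasingly general ancestors, and anonymization replaces values with the ancestor at a chosen level. The hierarchy below is an illustrative assumption, not one from the thesis:

```python
# Toy Value Generalization Hierarchy: value -> ancestors from most to least
# specific, ending at the suppression symbol '*'.
VGH = {
    "flu":      ["respiratory disease", "disease", "*"],
    "asthma":   ["respiratory disease", "disease", "*"],
    "diabetes": ["endocrine disease",   "disease", "*"],
}

def generalise(value, level):
    """level 0 returns the original value; higher levels climb toward '*'."""
    return value if level == 0 else VGH[value][level - 1]

assert generalise("flu", 1) == "respiratory disease"
assert generalise("flu", 3) == generalise("diabetes", 3) == "*"
```

A semantically poor hierarchy (e.g. grouping "flu" with "diabetes" at level 1) would generalise correctly in the mechanical sense yet destroy more utility than necessary, which is what the thesis's quality evaluation of VGHs aims to quantify.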
- Publication: Multi-objective Virtual Machine Reassignment for Large Data Centres (University College Dublin. School of Computer Science, 2017)
Data centres are large IT facilities composed of an intricate collection of interconnected and virtualised computers, connected services, and complex service-level agreements. Optimising data centres, often attempted by reassigning virtual machines to servers, is both desirable and challenging. It is desirable as it could save a large amount of money: using servers better would allow decommissioning unused ones, and organising services better would improve reliability and ease maintenance. It is challenging as the search space is very large and very constrained, which makes solutions difficult to find. Moreover, in practice assignments can be evaluated from different perspectives, such as electricity cost, overall reliability, migration overhead and cloud cost. Managers in data centres then make complex decisions and need to manipulate possible solutions favouring different objectives to find the right balance. Another element I consider in the context of this work is that organisations hosting large IT facilities are often geographically distributed, which means these organisations are composed of a number of hosting departments which have different preferences on what to host and where to host it, and a certain degree of autonomy.
The problem is even more challenging as companies can now choose from a pool of public cloud services to host some of their virtual machines. In this thesis, I address the problem of multi-objective virtual machine (VM) reassignment for large data centres from three realistic and challenging perspectives.
• First, I demonstrate how intractable the exact resolution of the problem is in a centralised context: I perform a thorough performance evaluation of classical solvers and metaheuristics, and I propose a novel hybrid algorithm which outperforms them.
• Second, I design a two-level system addressing multi-objective VM reassignment for large decentralised data centres. My system takes care of both the reassignment of VMs and their placement within the hosting departments, and I propose algorithms that optimise each of the levels.
• Third, I extend my work to the hybrid cloud world, i.e., when companies can decide to use their own internal resources or pay for public cloud computing resources. The problem now becomes more dynamic (as prices evolve) and challenging, and I propose a novel algorithm that takes all these elements into account.
- Publication: Study of Distributed Dynamic Clustering Framework for Spatial Data Mining (University College Dublin. School of Computer Science, 2017)
The amount of data generated per year will reach more than 44,000 billion gigabytes in 2020, ten times more than in 2003, and this growth is likely to continue according to current trends. This means more than 10,000 gigabytes of data per person per year are generated by daily life. Hence the term "Big Data" was introduced. Big Data refers to very large datasets collected from different fields, which are heterogeneous and continue to grow at a rapid pace. Analysing and extracting relevant information from these datasets is one of the biggest challenges, due to the huge storage capacity, processing power, and efficient mining algorithms needed to deal not only with their size but also with their heterogeneity, noise, and learning requirements. This requires architectural modifications in data storage and data management, as well as the development of new algorithms for efficient Big Data mining. In fact, the analysis of Big Data requires powerful, scalable, and accurate data analytics techniques that traditional data mining and machine learning do not, as a whole, provide. Therefore, new data analytics frameworks are needed to deal with the Big Data challenges of volume, velocity, veracity, and variety. Distributed data mining constitutes a promising approach for Big Data analytics, as datasets are usually produced in distributed locations, and processing them at their local sites significantly reduces response times, communication costs, etc. In this thesis, we developed and implemented a data mining framework that can analyse Big Data within a reasonable response time, produce accurate results, and use existing and current computing and storage infrastructure, such as cloud computing. The framework is distributed and deals with issues of high-performance computing.
The proposed approach was developed and implemented for spatial data mining. It is general, can handle very large data, and deals with the heterogeneity and velocity of the datasets. The approach consists of two phases: the first phase generates local models and the second aggregates the local results to obtain global models. It is capable of analysing the datasets located at each site using different clustering techniques. The aggregation phase is designed in such a way that the final clusters are compact and accurate, while the overall process is efficient in time and memory allocation. The approach was thoroughly tested and compared to well-known clustering algorithms. The results show that the approach not only produces high-quality results compared to the existing approaches, but also has super-linear speed-up and scales up very well by taking advantage of the Hadoop MapReduce paradigm.
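The two-phase structure, local models per site followed by global aggregation, can be sketched on one-dimensional data: each site ships only its cluster centroids, and the aggregation phase merges centroids that fall within a distance threshold. This is a minimal illustration of the idea, not the thesis's algorithm, and real spatial data would be multi-dimensional:

```python
def local_centroid(points):
    """Phase 1: each site summarises a local cluster by its centroid."""
    return sum(points) / len(points)

def aggregate(centroids, threshold):
    """Phase 2: merge centroids within `threshold` of each other (1-D sweep)."""
    merged = []
    for c in sorted(centroids):
        if merged and c - merged[-1][-1] <= threshold:
            merged[-1].append(c)
        else:
            merged.append([c])
    return [sum(group) / len(group) for group in merged]

site_a = [local_centroid([1.0, 1.2]), local_centroid([8.0, 8.4])]  # 1.1, 8.2
site_b = [local_centroid([1.3, 1.1]), local_centroid([7.9, 8.1])]  # 1.2, 8.0
global_models = aggregate(site_a + site_b, threshold=0.5)          # two global clusters
```

Only centroids cross the network, which is why processing data at its local site cuts communication cost, the property the abstract emphasises.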
- PublicationEfficient performance testing of Java web applications through workload adaptation(University College Dublin. School of Computer Science, 2019)Performance testing is a critical task to ensure an acceptable user experience with software systems, especially when there are high numbers of concurrent users. Selecting an appropriate test workload is a challenging and time-consuming process that relies heavily on the testers' expertise. Not only are workloads application-dependent, but it is usually also unclear how large a workload must be to expose any performance issues that exist in an application. Previous research has proposed to dynamically adapt the test workloads in real time, based on the application's behavior. Workload adaptation promises to decrease the effort and expertise required to carry out performance testing by reducing the need for the trial-and-error test cycles that static workloads entail. However, such approaches usually require testers to configure many parameters properly. This is cumbersome and hinders the usability and effectiveness of the approach, as a poor configuration, resulting in inadequate test workloads, could lead to problems being overlooked. To address this problem, this thesis outlines and explains the essential steps for conducting efficient performance testing with a dynamic workload adaptation approach, and examines the different factors influencing its performance. The research comprehensively evaluates one such approach to derive insights for practitioners on how to fine-tune the process to obtain better outcomes in different scenarios, and discusses the effects of varying its configuration on the results obtained. Furthermore, a novel tool was designed to improve the current implementation of dynamic workload adaptation.
This tool is built on top of JMeter and aims to help advance research and practice in performance testing using dynamic workload adaptation.
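The adaptation idea described above — grow the workload step by step and let the application's observed behavior decide when to stop — can be sketched as a simple feedback loop. This is a hypothetical simplification, not the thesis's tool: `measure` stands in for executing a JMeter test plan, and the latency model of the simulated system is invented for illustration.

```python
def adapt_workload(measure, start_users=10, step=10, max_users=1000,
                   latency_slo=0.5):
    """Increase the load step by step until latency breaches the SLO.

    measure(users) returns the observed mean response time in seconds
    for a given number of concurrent users; in a real setup this would
    wrap the execution of a JMeter test plan.
    """
    users = start_users
    while users <= max_users:
        if measure(users) > latency_slo:
            return users          # smallest tested load breaching the SLO
        users += step
    return max_users              # SLO never breached within the budget

# Toy stand-in for the system under test: latency degrades past 100 users.
def simulated_sut(users):
    return 0.1 + max(0, users - 100) * 0.01

saturation = adapt_workload(simulated_sut)
print(f"workload at which the SLO first breaks: {saturation} users")
```

The parameters of such a loop (step size, latency threshold, stopping budget) are exactly the kind of configuration whose tuning the thesis investigates, since poor choices can hide the saturation point entirely.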
- PublicationVariance and accuracy in probability estimation from samples: the case of cognitive biases(University College Dublin. School of Computer Science, 2020)A number of recent theories have suggested that the various systematic biases and fallacies seen in people's probabilistic reasoning may arise purely as a consequence of random variation in the reasoning process. The underlying argument in these theories is that random variation has systematic regressive effects, producing the observed patterns of bias. These theories typically take this random variation as a given and assume that the degree of random variation in probabilistic reasoning is large enough to account for observed patterns of fallacy and bias; there has been very little research directly examining the character of random variation in people's probabilistic judgement. In this thesis, four experiments are described that investigate the degree, level, and characteristic properties of random variation in people's probability judgement. They show that the degree of variance is easily large enough to account for the occurrence of two central fallacies in probabilistic reasoning (the conjunction fallacy and the disjunction fallacy), and that the level of variance is a reliable predictor of the occurrence of these fallacies. In addition, it is demonstrated that random variance in people's probabilistic judgement follows a particular mathematical model from frequentist probability theory: the binomial proportion distribution. This result supports a model in which people reason about probabilities in a way that follows frequentist probability theory but is subject to random variation or noise.
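The binomial-proportion account sketched above can be illustrated with a short simulation: if each probability judgement is modelled as the proportion of successes among n mental samples, its variance is p(1-p)/n, and noise alone makes the estimate of a conjunction exceed the estimate of its constituent on a sizeable fraction of trials. The sample size n = 20 and the probabilities here are arbitrary illustration values, not figures from the thesis.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20                       # mental samples per judgement (illustrative)
p_A, p_AandB = 0.4, 0.3      # true probabilities; the conjunction is lower
trials = 100_000

# Each judgement = proportion of successes in n samples (binomial model)
est_A = rng.binomial(n, p_A, trials) / n
est_AB = rng.binomial(n, p_AandB, trials) / n

# The variance of the estimates matches the binomial proportion p(1-p)/n
assert abs(est_A.var() - p_A * (1 - p_A) / n) < 1e-3

# Fraction of trials where noise alone produces a "conjunction fallacy",
# i.e. the conjunction is judged more probable than its constituent
fallacy_rate = (est_AB > est_A).mean()
print(f"fallacy rate from noise alone: {fallacy_rate:.2%}")
```

Even though every individual estimate is unbiased, the overlap of the two noisy sampling distributions reverses the true ordering on a substantial minority of trials, which is the regressive effect the theories appeal to.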
- PublicationElectromagnetic Side-Channel Analysis Methods for Digital Forensics on Internet of Things(University College Dublin. School of Computer Science, 2020)
;0000-0001-9558-7913Modern legal and corporate investigations rely heavily on the field of digital forensics to uncover vital evidence. The dawn of Internet of Things (IoT) devices has expanded this horizon by providing new kinds of evidence sources that were not available in traditional digital forensics. However, unlike desktop and laptop computers, the bespoke hardware and software employed on most IoT devices obstruct the use of classical digital forensic evidence acquisition methods. This situation demands alternative approaches to forensically inspect IoT devices. Electromagnetic Side-Channel Analysis (EM-SCA) is a branch of information security that exploits the electromagnetic (EM) radiation of computers to eavesdrop on them and exfiltrate sensitive information. A multitude of EM-SCA methods have been demonstrated to be effective in attacking computing systems under various circumstances. The objective of this thesis is to explore the potential of leveraging EM-SCA as a forensic evidence acquisition method for IoT devices. Towards this objective, the thesis formulates a model for IoT forensics that uses EM-SCA methods. The design of the proposed model enables investigators to perform complex forensic insight-gathering procedures without expertise in the field of EM-SCA. To demonstrate the function of the proposed model, a proof of concept was implemented as an open-source software framework called EMvidence. The framework has a modular architecture following the Unix philosophy, where each module is kept minimal and focused on extracting a specific forensic insight from a specific IoT device. By doing so, the burden of dealing with the diversity of the IoT ecosystem is distributed from a central point into individual modules.
Under the proposed model, this thesis presents the design, implementation, and evaluation of a collection of methods that can be used to acquire forensic insights from IoT devices through their EM radiation patterns. These forensic insights include detecting cryptography-related events, the firmware version, malicious modifications to the firmware, and the internal forensic state of IoT devices. The designed methods use supervised Machine Learning (ML) algorithms at their core to automatically identify known patterns of EM radiation with over 90% accuracy. In practice, the forensic inspection of IoT devices using EM-SCA methods may often be conducted during the triage examination phase using moderately resourced computers, such as a laptop carried by the investigator. However, the scale of EM data generation at fast sample rates, and the dimensionality of EM data due to large bandwidths, necessitate rich computational resources to process EM datasets. This thesis explores two approaches to reduce such overheads. Firstly, a careful reduction of the sample rate is found to reduce the generated EM data by up to 80%. Secondly, an intelligent channel selection method is presented that drastically reduces the dimensionality of EM data by selecting 500 dimensions out of 20,000. The findings of this thesis pave the way for noninvasive forensic insight acquisition from IoT devices. With IoT systems increasingly blending into day-to-day life, the proposed methodology has the potential to become the lifeline of future digital forensic investigations. A multitude of research directions are outlined that can strengthen this novel approach in the future.
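The channel-selection idea described above can be illustrated on synthetic data: rank channels by between-class mean separation, keep only the top-scoring few, and classify the reduced traces. Everything here is a simplified stand-in at reduced scale (2,000 channels rather than 20,000) — the synthetic "EM spectra", the scoring rule, and the nearest-centroid classifier are illustrative assumptions, not the thesis's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_informative, n_keep = 2000, 40, 50   # scaled-down illustration

# Synthetic "EM spectra" for two device states; few channels carry the signal
def make_traces(n, shift):
    base = rng.normal(0, 1, (n, n_channels))
    base[:, :n_informative] += shift
    return base

X0, X1 = make_traces(200, 0.0), make_traces(200, 1.5)

# Channel selection: score each channel by between-class mean separation
score = np.abs(X0.mean(axis=0) - X1.mean(axis=0))
keep = np.argsort(score)[-n_keep:]                 # retain top-scoring channels

# Nearest-centroid classifier on the reduced representation
c0, c1 = X0[:, keep].mean(axis=0), X1[:, keep].mean(axis=0)
X_test = np.vstack([make_traces(50, 0.0), make_traces(50, 1.5)])[:, keep]
y_test = np.array([0] * 50 + [1] * 50)
pred = (np.linalg.norm(X_test - c1, axis=1)
        < np.linalg.norm(X_test - c0, axis=1)).astype(int)
accuracy = (pred == y_test).mean()
print(f"accuracy with {n_keep}/{n_channels} channels: {accuracy:.0%}")
```

The point of the sketch is the cost argument: once the informative channels are identified, classification operates on a fraction of the original dimensionality, which is what makes triage on a moderately resourced laptop plausible.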