Engineering Forensic-Ready Software Systems Using Automated Logging
Author(s)
Date Issued
2025
Date Available
2025-11-17T11:06:12Z
Abstract
Software is transforming the world by connecting physical processes and controlling many of humankind’s technical and social activities. Organizations worldwide use software to perform business processes and deliver services, while nations rely on it for infrastructure. This thesis focuses on software data misuse incidents, in which offenders exploit vulnerabilities to make software use data in ways that designers and developers did not anticipate, such as illegitimate modification of sensitive records. These incidents can involve insiders (e.g., employees, contractors) or outsiders (e.g., criminals, state-sponsored attackers) motivated by financial gain, revenge, or other factors. Software data misuse incidents cannot always be prevented because they often result from the improper use of legitimate software functionality. Therefore, software systems need to be “forensic-ready”, that is, capable of collecting data that could serve as evidence during a digital investigation. Forensic-ready systems should collect data relevant to the incident (relevance property) while avoiding collecting data unrelated to the incident (minimality property). OWASP identified “Security Logging and Monitoring Failures” as a top security risk; such failures often leave security breaches undetected for long periods and make incidents difficult to reconstruct. Logs generated in software systems can provide information about how a security breach occurred. This thesis aims to assist software developers in building forensic-ready software systems that perform logging to support digital investigations of software data misuse incidents. To achieve this aim, this thesis addresses two research questions:
1. RQ1: How can we elicit security logging requirements that can be traced to software components and method invocations?
2. RQ2: How can we automatically generate logging instructions in software systems that comply with forensic-ready software systems requirements, such as relevance and minimality?
To answer RQ1, we propose an approach that automatically generates a model of a security incident by emulating the data misuse of a software system. This model is represented as a sequence diagram in which message exchanges map directly to method invocations. These diagrams elicit security logging requirements and identify the exact locations where logging instructions should be placed and the information they should log. To answer RQ2, we use the information provided in the incident model to identify where logging should be implemented and which information should be logged. We then propose an approach that instruments the software system by injecting logging statements at those specific points of its implementation.
We evaluated the conclusion validity and performance of our approach using two Java software systems: a custom-built human resources management system and a large-scale open-source system. We evaluated conclusion validity by verifying that the automatically generated incident models included the software components (e.g., human actors, packages, classes and methods) involved when the attack scenario was replayed in the software system. We also assessed whether the generated security logs complied with the relevance and minimality properties. We evaluated performance by measuring the time taken to generate the incident models and the logging instructions. Our results show that our approach satisfies relevance because the generated security logs include the methods that identify data misuse activities. It also satisfies minimality because it excludes data irrelevant to the incident. Performance depends on the number of methods in the incident models and security logs, as well as on model image generation and on the number of log lines and attributes associated with each logged method.
Type of Material
Doctoral Thesis
Qualification Name
Doctor of Philosophy (Ph.D.)
Publisher
University College Dublin. School of Computer Science
Copyright (Published Version)
2025 the Author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
File(s)
Name
Fanny_PhD_Thesis.pdf
Size
10.4 MB
Format
Adobe PDF
Checksum (MD5)
6829dd1bf93c5425851abc282bca0347