Now showing 1 - 10 of 15
  • Publication
    Behavioral Service Graphs: A Formal Data-Driven Approach for Prompt Investigation of Enterprise and Internet-wide Infections
    (Elsevier, 2017-03-21) ;
    The task of generating network-based evidence to support network forensic investigation is becoming increasingly prominent. Undoubtedly, such evidence is significantly imperative as it not only can be used to diagnose and respond to various network-related issues (i.e., performance bottlenecks, routing issues, etc.) but more importantly, can be leveraged to infer and further investigate network security intrusions and infections. In this context, this paper proposes a proactive approach that aims at generating accurate and actionable network-based evidence related to groups of compromised network machines (i.e., campaigns). The approach is envisioned to guide investigators to promptly pinpoint such malicious groups for possible immediate mitigation as well as empowering network and digital forensic specialists to further examine those machines using auxiliary collected data or extracted digital artifacts. On one hand, the promptness of the approach is successfully achieved by monitoring and correlating perceived probing activities, which are typically the very first signs of an infection or misdemeanors. On the other hand, the generated evidence is accurate as it is based on an anomaly inference that fuses data behavioral analytics in conjunction with formal graph theoretic concepts. We evaluate the proposed approach in two deployment scenarios, namely, as an enterprise edge engine and as a global capability in a security operations center model. The empirical evaluation that employs 10 GB of real botnet traffic and 80 GB of real darknet traffic indeed demonstrates the accuracy, effectiveness and simplicity of the generated network-based evidence.
      341Scopus© Citations 7
  • Publication
    Digital forensic investigation of two-way radio communication equipment and services
    Historically, radio-equipment has solely been used as a two-way analogue communication device. Today, the use of radio communication equipment is increasing by numerous organisations and businesses. The functionality of these traditionally short-range devices have expanded to include private call, address book, call-logs, text messages, lone worker, telemetry, data communication, and GPS. Many of these devices also integrate with smartphones, which delivers Push-To-Talk services that make it possible to setup connections between users using a two-way radio and a smartphone. In fact, these devices can be used to connect users only using smartphones. To date, there is little research on the digital traces in modern radio communication equipment. In fact, increasing the knowledge base about these radio communication devices and services can be valuable to law enforcement in a police investigation. In this paper, we investigate what kind of radio communication equipment and services law enforcement digital investigators can encounter at a crime scene or in an investigation. Subsequent to seizure of this radio communication equipment we explore the traces, which may have a forensic interest and how these traces can be acquired. Finally, we test our approach on sample radio communication equipment and services.
      16Scopus© Citations 3
  • Publication
    Tiered Forensic Methodology Model for Digital Field Triage by Non-Digital Evidence Specialists
    Due to budgetary constraints and the high level of training required, digital forensic analysts are in short supply in police forces the world over. This inevitably leads to a prolonged time taken between an investigator sending the digital evidence for analysis and receiving the analytical report back. In an attempt to expedite this procedure, various process models have been created to place the forensic analyst in the field conducting a triage of the digital evidence. By conducting triage in the field, an investigator is able to act upon pertinent information quicker, while waiting on the full report. The work presented as part of this paper focuses on the training of front-line personnel in the field triage process, without the need of a forensic analyst attending the scene. The premise has been successfully implemented within regular/non-digital forensics, i.e., crime scene investigation. In that field, front-line members have been trained in specific tasks to supplement the trained specialists. The concept of front-line members conducting triage of digital evidence in the field is achieved through the development of a new process model providing guidance to these members. To prove the model's viability, an implementation of this new process model is presented and evaluated. The results outlined demonstrate how a tiered response involving digital evidence specialists and non-specialists can better deal with the increasing number of investigations involving digital evidence.
      357Scopus© Citations 42
  • Publication
    Deep learning at the shallow end: Malware classification for non-domain experts
    Current malware detection and classification approaches generally rely on time consuming and knowledge intensive processes to extract patterns (signatures) and behaviors from malware, which are then used for identification. Moreover, these signatures are often limited to local, contiguous sequences within the data whilst ignoring their context in relation to each other and throughout the malware file as a whole. We present a Deep Learning based malware classification approach that requires no expert domain knowledge and is based on a purely data driven approach for complex pattern and feature identification.
      67Scopus© Citations 93
  • Publication
    EviPlant: An Efficient Digital Forensic Challenge Creation, Manipulation, and Distribution Solution
    (Elsevier, 2017-03-21) ; ;
    Education and training in digital forensics requires a variety of suitable challenge corpora containing realistic features including regular wear-and-tear, background noise, and the actual digital traces to be discovered during investigation. Typically, the creation of these challenges requires overly arduous effort on behalf of the educator to ensure their viability. Once created, the challenge image needs to be stored and distributed to a class for practical training. This storage and distribution step requires significant resources and time and may not even be possible in an online/distance learning scenario due to the data sizes involved. As part of this paper, we introduce a more capable methodology and system to current approaches. EviPlant is a system designed for the efficient creation, manipulation, storage and distribution of challenges for digital forensics education and training. The system relies on the initial distribution of base disk images, i.e., images containing solely bare operating systems. In order to create challenges for students, educators can boot the base system, emulate the desired activity and perform a diffing of resultant image and the base image. This diffing process extracts the modified artefacts and associated metadata and stores them in an evidence package. Evidence packages can be created for different personas, different wear-and-tear, different emulated crimes, etc., and multiple evidence packages can be distributed to students and integrated with the base images. A number of advantages and additional functionality over the current approaches are discussed that emerge as a result of using EviPlant.
      423Scopus© Citations 14
  • Publication
    BitTorrent Sync: First Impressions and Digital Forensic Implications
    With professional and home Internet users becoming increasingly concerned with data protection and privacy, the privacy afforded by popular cloud file synchronisation services, such as Dropbox, OneDrive and Google Drive, is coming under scrutiny in the press. A number of these services have recently been reported as sharing information with governmental security agencies without warrants. BitTorrent Sync is seen as an alternative by many and has gathered over two million users by December 2013 (doubling since the previous month). The service is completely decentralised, offers much of the same synchronisation functionality of cloud powered services and utilises encryption for data transmission (and optionally for remote storage). The importance of understanding BitTorrent Sync and its resulting digital investigative implications for law enforcement and forensic investigators will be paramount to future investigations. This paper outlines the client application, its detected network traffic and identifies artefacts that may be of value as evidence for future digital investigations.
      406Scopus© Citations 5
  • Publication
    DeepUAge: Improving Underage Age Estimation Accuracy to Aid CSEM Investigation
    Age is a soft biometric trait that can aid law enforcement in the identification of victims of Child Sexual Exploitation Material (CSEM) creation/distribution. Accurate age estimation of subjects can classify explicit content possession as illegal during an investigation. Automation of this age classification has the potential to expedite content discovery and focus the investigation of digital evidence through the prioritisation of evidence containing CSEM. In recent years, artificial intelligence based approaches for automated age estimation have been created, and many public cloud service providers offer this service on their platforms. The accuracy of these algorithms have been improving over recent years. These existing approaches perform satisfactorily for adult subjects, but perform wholly inadequately for underage subjects. To this end, the largest underage facial age dataset, VisAGe, has been used in this work to train a ResNet50 based deep learning model, DeepUAge, that achieved state-of-the-art beating performance for age estimation of minors. This paper describes the design and implementation of this model. An evaluation, validation and comparison of the proposed model is performed against existing facial age classifiers resulting in the best overall performance for underage subjects.
      47Scopus© Citations 15
  • Publication
    Improving the accuracy of automated facial age estimation to aid CSEM investigations
    The investigation of violent crimes against individuals, such as the investigation of child sexual exploitation material (CSEM), is one of the more commonly encountered criminal investigation types throughout the world. While hash lists of known CSEM content are commonly used to identify previously encountered material on suspects’ devices, previously unencountered material requires expert, manual analysis and categorisation. The discovery, analysis, and categorisation of these digital images and videos has the potential to be significantly expedited with the use of automated artificial intelligence (AI) based techniques. Intelligent, automated evidence processing and prioritisation has the potential to aid investigators in alleviating some of the digital evidence backlogs that have become commonplace worldwide. In order for AI-aided CSEM investigations to be beneficial, the fundamental question when analysing multimedia content becomes “how old is each subject encountered?’’. Our work presents the evaluation of existing cloud-based and offline age estimation services, introduces our deep learning model, DS13K, which was created with a VGG-16 Deep Convolutional Neural Network (CNN) architecture, and develops an ensemble technique that improves the accuracy of underage facial age estimation. In addition to our model, a number of existing services including Amazon Rekognition, Microsoft Azure Cognitive Services,, and Deep Expectation (DEX) were used to create an ensemble learning technique. It was found that for the borderline adulthood age range (i.e., 16–17 years old), our DS13K model substantially outperformed existing services, achieving a performance accuracy of 68%. A comparative examination of the obtained results allowed us to identify performance trends and issues inherent to each service/tool and develop ensemble techniques to improve the accuracy of automated adulthood determination.
  • Publication
    Project Maelstrom: Forensic Analysis of the BitTorrent-Powered Browser
    (Association of Digital Forensics, Security and Law, 2015-09) ; ;
    In April 2015, BitTorrent Inc. released their distributed peer-to-peer powered browser, Project Maelstrom, into public beta. The browser facilitates a new alternative website distribution paradigm to the traditional HTTP-based, client-server model. This decentralised web is powered by each of the visitors accessing each Maelstrom hosted website. Each user shares their copy of the websites source code and multimedia content with new visitors. As a result, a Maelstrom hosted website cannot be taken offline by law enforcement or any other parties. Due to this open distribution model, a number of interesting censorship, security and privacy considerations are raised. This paper explores the application, its protocol, sharing Maelstrom content and its new visitor powered 'web-hosting' paradigm.
  • Publication
    Cutting Through the Emissions: Feature Selection from Electromagnetic Side-Channel Data for Activity Detection
    Electromagnetic side-channel analysis (EM-SCA) has been used as a window to eavesdrop on computing devices for information security purposes. It has recently been proposed to use as a digital evidence acquisition method in forensic investigation scenarios as well. The massive amount of data produced by EM signal acquisition devices makes it difficult to process in real-time making on-site EM-SCA infeasible. Uncertainty surrounds the precise information leaking frequency channel demanding the acquisition of signals over a wide bandwidth. As a consequence, investigators are left with a large number of potential frequency channels to be inspected; with many not containing any useful information leakages. The identification of a small subset of frequency channels that leak a sufficient amount of information can significantly boost the performance enabling real-time analysis. This work presents a systematic methodology to identify information leaking frequency channels from high dimensional EM data with the help of multiple filtering techniques and machine learning algorithms. The evaluations show that it is possible to narrow down the number of frequency channels from over 20,000 to less than a hundred (81 channels). The experiments presented show an accuracy of 0.9315 when all the 20,000 channels are used, an accuracy of 0.9395 with the highest 500 channels after calculating the variance between the average value of each class, and an accuracy of 0.9047 when the best 81 channels according to Recursive Feature Elimination are considered.
      25Scopus© Citations 8