Now showing 1 - 4 of 4
  • Publication
    Hierarchical Bloom Filter Trees for Approximate Matching
    (Journal of Digital Forensics, Security and Law, 2018-01) ; ;
    Bytewise approximate matching algorithms have in recent years shown significant promise in detecting files that are similar at the byte level. This is very useful for digital forensic investigators, who are regularly faced with the problem of searching through a seized device for pertinent data. A common scenario is where an investigator is in possession of a collection of "known-illegal" files (e.g. a collection of child abuse material) and wishes to find whether copies of these are stored on the seized device. Approximate matching addresses shortcomings in traditional hashing, which can only find identical files, by also being able to deal with cases of merged files, embedded files, partial files, or if a file has been changed in any way. Most approximate matching algorithms work by comparing pairs of files, which is not a scalable approach when faced with large corpora. This paper demonstrates the effectiveness of using a "Hierarchical Bloom Filter Tree" (HBFT) data structure to reduce the running time of collection-against-collection matching, with a specific focus on the MRSH-v2 algorithm. Three experiments are discussed, which explore the effects of different configurations of HBFTs. The proposed approach dramatically reduces the number of pairwise comparisons required, and demonstrates substantial speed gains, while maintaining effectiveness.
      254
  • Publication
    Current Challenges and Future Research Areas for Digital Forensic Investigation
    Given the ever-increasing prevalence of technology in modern life, there is a corresponding increase in the likelihood of digital devices being pertinent to a criminal investigation or civil litigation. As a direct consequence, the number of investigations requiring digital forensic expertise is resulting in huge digital evidence backlogs being encountered by law enforcement agencies throughout the world. It can be anticipated that the number of cases requiring digital forensic analysis will greatly increase in the future. It is also likely that each case will require the analysis of an increasing number of devices including computers, smartphones, tablets, cloud-based services, Internet of Things devices, wearables, etc. The variety of new digital evidence sources poses new and challenging problems for the digital investigator from an identification, acquisition, storage and analysis perspective. This paper explores the current challenges contributing to the backlog in digital forensics from a technical standpoint and outlines a number of future research topics that could greatly contribute to a more efficient digital forensic process.
      547
  • Publication
    Improving Borderline Adulthood Facial Age Estimation through Ensemble Learning
    Achieving high performance for facial age estimation with subjects in the borderline between adulthood and non-adulthood has always been a challenge. Several studies have used different approaches from the age of a baby to an elder adult and different datasets have been employed to measure the mean absolute error (MAE) ranging between 1.47 to 8 years. The weakness of the algorithms specifically in the borderline has been a motivation for this paper. In our approach, we have developed an ensemble technique that improves the accuracy of underage estimation in conjunction with our deep learning model (DS13K) that has been fine-tuned on the Deep Expectation (DEX) model. We have achieved an accuracy of 68% for the age group 16 to 17 years old, which is 4 times better than the DEX accuracy for such age range. We also present an evaluation of existing cloud-based and offline facial age prediction services, such as Amazon Rekognition, Microsoft Azure Cognitive Services, How-Old.net and DEX.
      220ScopusĀ© Citations 10
  • Publication
    EviPlant: An Efficient Digital Forensic Challenge Creation, Manipulation, and Distribution Solution
    (Elsevier, 2017-03-21) ; ;
    Education and training in digital forensics requires a variety of suitable challenge corpora containing realistic features including regular wear-and-tear, background noise, and the actual digital traces to be discovered during investigation. Typically, the creation of these challenges requires overly arduous effort on behalf of the educator to ensure their viability. Once created, the challenge image needs to be stored and distributed to a class for practical training. This storage and distribution step requires significant resources and time and may not even be possible in an online/distance learning scenario due to the data sizes involved. As part of this paper, we introduce a more capable methodology and system to current approaches. EviPlant is a system designed for the efficient creation, manipulation, storage and distribution of challenges for digital forensics education and training. The system relies on the initial distribution of base disk images, i.e., images containing solely bare operating systems. In order to create challenges for students, educators can boot the base system, emulate the desired activity and perform a diffing of resultant image and the base image. This diffing process extracts the modified artefacts and associated metadata and stores them in an evidence package. Evidence packages can be created for different personas, different wear-and-tear, different emulated crimes, etc., and multiple evidence packages can be distributed to students and integrated with the base images. A number of advantages and additional functionality over the current approaches are discussed that emerge as a result of using EviPlant.
      309ScopusĀ© Citations 8