
The Case for a Collaborative Universal Peer-to-Peer Botnet Investigation Framework

2014-03-25, Scanlon, Mark, Kechadi, Tahar

Peer-to-Peer (P2P) botnets are becoming widely used as a low-overhead, efficient, self-maintaining, distributed alternative to the traditional client/server model across a broad range of cyberattacks. These cyberattacks can take the form of distributed denial of service attacks, authentication cracking, spamming, cyberwarfare or malware distribution targeting financial systems. These attacks can also cross over into the physical world, attacking critical infrastructure (power, communications, water, etc.) and causing its disruption or destruction. P2P technology lends itself well to being exploited for such malicious purposes due to the minimal setup, running and maintenance costs involved in executing a globally orchestrated attack, alongside the perceived additional layer of anonymity. In the ever-evolving space of botnet technology, reducing the time lag between discovering a newly developed or updated botnet system and gaining the ability to mitigate it is paramount. Often, numerous investigative bodies duplicate their efforts in creating bespoke tools to combat particular threats. This paper outlines a framework capable of fast-tracking the investigative process through collaboration between key stakeholders.

EMvidence: A Framework for Digital Evidence Acquisition from IoT Devices through Electromagnetic Side-Channel Analysis

2020-04, Sayakkara, Asanka P., Le-Khac, Nhien-An, Scanlon, Mark

EM side-channel analysis (EM-SCA) is a branch of information security in which the unintentional electromagnetic (EM) emissions from computing devices are analysed. It has been used for various purposes including software behaviour detection, software modification detection, malicious software identification, and data extraction. The possibility of applying EM-SCA in digital forensic investigation scenarios involving IoT devices has been proposed recently. When it is difficult or impossible to acquire forensic evidence from an IoT device, observing the EM emissions of the device can provide valuable information to an investigator. This work addresses the challenge of making EM-SCA a practical reality for digital forensic investigators by introducing a software framework called EMvidence. The framework is designed to facilitate extensibility through an EM plug-in model.

Improving Borderline Adulthood Facial Age Estimation through Ensemble Learning

2019-08-26, Anda, Felix, Lillis, David, Kanta, Aikaterini, Becker, Brett A., Bou-Harb, Elias, Le-Khac, Nhien-An, Scanlon, Mark

Achieving high performance for facial age estimation with subjects on the borderline between adulthood and non-adulthood has always been a challenge. Several studies have applied different approaches to subjects ranging in age from babies to elderly adults, and different datasets have been employed, with reported mean absolute error (MAE) ranging from 1.47 to 8 years. The weakness of the algorithms specifically at the borderline has been a motivation for this paper. In our approach, we have developed an ensemble technique that improves the accuracy of underage estimation in conjunction with our deep learning model (DS13K), which has been fine-tuned on the Deep Expectation (DEX) model. We have achieved an accuracy of 68% for the age group 16 to 17 years old, which is 4 times better than the DEX accuracy for that age range. We also present an evaluation of existing cloud-based and offline facial age prediction services, such as Amazon Rekognition, Microsoft Azure Cognitive Services, How-Old.net and DEX.

Expediting MRSH-v2 Approximate Matching with Hierarchical Bloom Filter Trees

2018-01-06, Lillis, David, Breitinger, Frank, Scanlon, Mark

Perhaps the most common task encountered by digital forensic investigators consists of searching through a seized device for pertinent data. Frequently, an investigator will be in possession of a collection of “known-illegal” files (e.g. a collection of child pornographic images) and will seek to find whether copies of these are stored on the seized drive. Traditional hash matching techniques can efficiently find files that precisely match. However, these will fail in the case of merged files, embedded files, partial files, or if a file has been changed in any way. In recent years, approximate matching algorithms have shown significant promise in the detection of files that have a high bytewise similarity. This paper focuses on MRSH-v2. A number of experiments were conducted using Hierarchical Bloom Filter Trees to dramatically reduce the quantity of pairwise comparisons that must be made between known-illegal files and files on the seized disk. The experiments demonstrate substantial speed gains over the original MRSH-v2, while maintaining effectiveness.
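
The limitation of traditional hash matching described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `sha256_hex` and `exact_matches` and the sample data are hypothetical, and SHA-256 is assumed here only as a representative cryptographic hash.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def exact_matches(known_hashes: set, files: dict) -> list:
    """Return names of files whose digest appears in the known-hash set."""
    return [name for name, data in files.items()
            if sha256_hex(data) in known_hashes]


# Hypothetical data: one known file and three files found on a seized drive.
known = b"contents of a known file"
known_hashes = {sha256_hex(known)}
seized = {
    "identical.bin": known,                        # byte-for-byte copy
    "modified.bin": known + b"!",                  # one byte appended
    "embedded.bin": b"header" + known + b"tail",   # known file embedded in another
}

print(exact_matches(known_hashes, seized))
```

Only the byte-for-byte copy is reported. The modified and embedded variants produce unrelated digests, which is the gap that approximate matching is intended to close.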

Hierarchical Bloom Filter Trees for Approximate Matching

2018-01, Lillis, David, Breitinger, Frank, Scanlon, Mark

Bytewise approximate matching algorithms have in recent years shown significant promise in detecting files that are similar at the byte level. This is very useful for digital forensic investigators, who are regularly faced with the problem of searching through a seized device for pertinent data. A common scenario is where an investigator is in possession of a collection of "known-illegal" files (e.g. a collection of child abuse material) and wishes to find whether copies of these are stored on the seized device. Approximate matching addresses shortcomings in traditional hashing, which can only find identical files, by also being able to deal with cases of merged files, embedded files, partial files, or if a file has been changed in any way. Most approximate matching algorithms work by comparing pairs of files, which is not a scalable approach when faced with large corpora. This paper demonstrates the effectiveness of using a "Hierarchical Bloom Filter Tree" (HBFT) data structure to reduce the running time of collection-against-collection matching, with a specific focus on the MRSH-v2 algorithm. Three experiments are discussed, which explore the effects of different configurations of HBFTs. The proposed approach dramatically reduces the number of pairwise comparisons required, and demonstrates substantial speed gains, while maintaining effectiveness.
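
The idea of using a tree of Bloom filters to avoid comparing every pair of files can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: fixed-size chunks stand in for the features MRSH-v2 would extract, and the names `BloomFilter`, `Node`, `features` and `candidates`, the filter size, and the `min_hits` threshold are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, size_bits=4096, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0

    def _positions(self, item: bytes):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item: bytes):
        return all((self.bits >> p) & 1 for p in self._positions(item))


def features(data: bytes, chunk=8):
    """Assumption: fixed-size chunks stand in for MRSH-v2 features."""
    return [data[i:i + chunk] for i in range(0, len(data), chunk)]


class Node:
    """Tree node whose filter holds the features of every file beneath it."""

    def __init__(self, files: dict):
        self.names = list(files)
        self.bf = BloomFilter()
        for data in files.values():
            for f in features(data):
                self.bf.add(f)
        self.children = []
        if len(files) > 1:
            items = list(files.items())
            mid = len(items) // 2
            self.children = [Node(dict(items[:mid])), Node(dict(items[mid:]))]


def candidates(node: Node, data: bytes, min_hits=2):
    """Descend only into subtrees whose filter matches enough features."""
    hits = sum(1 for f in features(data) if f in node.bf)
    if hits < min_hits:
        return []          # prune this whole subtree
    if not node.children:
        return node.names  # leaf: a single file left for pairwise comparison
    return [n for c in node.children for n in candidates(c, data, min_hits)]
```

A file from the seized device is checked against the root first; subtrees whose filter does not contain enough of its features are skipped entirely, so only the files named by `candidates` need a full pairwise comparison. Because Bloom filters can report false positives, the returned names are candidates rather than confirmed matches.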

Enabling non-expert analysis of large volumes of intercepted network traffic

2018-08-30, Wiel, Erwin van de, Scanlon, Mark, Le-Khac, Nhien-An

Telecommunications wiretaps are commonly used by law enforcement in criminal investigations. While phone-based wiretapping has seen considerable success, the same cannot be said for Internet taps. Large portions of intercepted Internet traffic are often encrypted, making it difficult to obtain useful information. The advent of the Internet of Things further complicates network wiretapping. In fact, the current level of complexity of intercepted network traffic is almost at the point where data cannot be analyzed without the active involvement of experts. Additionally, investigations typically focus on analyzing traffic in chronological order and predominately examine the data content of the intercepted traffic. This approach is overly arduous when the amount of data to be analyzed is very large. This chapter describes a novel approach for analyzing large amounts of intercepted network traffic based on traffic metadata. The approach significantly reduces the analysis time and provides useful insights and information to non-technical investigators. The approach is evaluated using a large sample of network traffic data.

Leveraging Decentralization to Extend the Digital Evidence Acquisition Window: Case Study on BitTorrent Sync

2014-09-20, Kechadi, Tahar, Scanlon, Mark, Farina, Jason, Le-Khac, Nhien-An

File synchronization services such as Dropbox, Google Drive, Microsoft OneDrive and Apple iCloud are becoming increasingly popular in today’s always-connected world. A popular alternative to the aforementioned services is BitTorrent Sync. This is a decentralized/cloudless file synchronization service and is gaining significant popularity among Internet users with privacy concerns over where their data is stored and who has the ability to access it. The focus of this paper is the remote recovery of digital evidence pertaining to files identified as being accessed or stored on a suspect’s computer or mobile device. A methodology for the identification, investigation, recovery and verification of such remote digital evidence is outlined. Finally, a proof-of-concept remote evidence recovery from a BitTorrent Sync shared folder is presented, highlighting a number of potential scenarios for the recovery and verification of such evidence.

Investigating Cybercrimes that Occur on Documented P2P Networks

2013-09-01, Scanlon, Mark, Hannaway, Alan, Kechadi, Tahar

Peer-to-Peer (P2P) Internet communication technologies are increasingly being exploited to aid cybercrime. P2P systems can be used or exploited to aid in the execution of a large number of online criminal activities, e.g., copyright infringement, fraud, malware and virus distribution, and botnet creation and control. P2P technology is perhaps most famous for the unauthorised distribution of copyrighted materials since the late 1990s, with the popularity of file-sharing programs such as Napster. In 2004, P2P traffic accounted for 80% of all Internet traffic and in 2005, BitTorrent traffic specifically accounted for over 60% of the world’s P2P bandwidth usage. This paper outlines a methodology for investigating a documented P2P network, BitTorrent, using a sample investigation for reference throughout. The sample investigation outlined was conducted on the top 100 most popular BitTorrent swarms over the course of a one-week period.

Automated Artefact Relevancy Determination from Artefact Metadata and Associated Timeline Events

2020-06-19, Du, Xiaoyu, Le, Quan, Scanlon, Mark

Case-hindering, multi-year digital forensic evidence backlogs have become commonplace in law enforcement agencies throughout the world. This is due to an ever-growing number of cases requiring digital forensic investigation coupled with the growing volume of data to be processed per case. Leveraging previously processed digital forensic cases and their component artefact relevancy classifications can facilitate an opportunity for training automated artificial intelligence based evidence processing systems. These can significantly aid investigators in the discovery and prioritisation of evidence. This paper presents one approach for file artefact relevancy determination building on the growing trend towards a centralised, Digital Forensics as a Service (DFaaS) paradigm. This approach enables the use of previously encountered pertinent files to classify newly discovered files in an investigation. Trained models can aid in the detection of these files during the acquisition stage, i.e., during their upload to a DFaaS system. The technique generates a relevancy score for file similarity using each artefact's filesystem metadata and associated timeline events. The approach presented is validated against three experimental usage scenarios.

Solid State Drive Forensics: Where Do We Stand?

2019, Vieyra, John, Scanlon, Mark, Le-Khac, Nhien-An

With Solid State Drives (SSDs) becoming more and more prevalent in personal computers, some have suggested that the playing field has changed when it comes to forensic analysis. Inside the SSD, data movement events occur without any user input. Recent research has suggested that SSDs can no longer be managed in the same manner when performing digital forensic examinations. In performing forensic analysis of SSDs, the events that take place in the background need to be understood and documented by the forensic investigator. These behind-the-scenes processes cannot be stopped with traditional disk write blockers and have now become an acceptable consequence when performing forensic analysis. In this paper, we aim to provide some clear guidance as to what precisely is happening in the background of SSDs during their operation and investigation, and we also study forensic methods to extract artefacts from SSDs under different conditions in terms of volume of data, powered effect, etc. In addition, we evaluate our approach with several experiments across various use-case scenarios.