University College Dublin

Hierarchical Bloom Filter Trees for Approximate Matching

File(s)
jdfsl.pdf (349.54 KB)
Author(s)
Lillis, David 
Breitinger, Frank 
Scanlon, Mark 
Uri
http://hdl.handle.net/10197/9694
Date Issued
January 2018
Date Available
26 March 2019, 12:48:51 UTC
Abstract
Bytewise approximate matching algorithms have in recent years shown significant promise in detecting files that are similar at the byte level. This is very useful for digital forensic investigators, who are regularly faced with the problem of searching through a seized device for pertinent data. A common scenario is where an investigator is in possession of a collection of "known-illegal" files (e.g. a collection of child abuse material) and wishes to find whether copies of these are stored on the seized device. Approximate matching addresses shortcomings in traditional hashing, which can only find identical files, by also being able to deal with cases of merged files, embedded files, partial files, or if a file has been changed in any way. Most approximate matching algorithms work by comparing pairs of files, which is not a scalable approach when faced with large corpora. This paper demonstrates the effectiveness of using a "Hierarchical Bloom Filter Tree" (HBFT) data structure to reduce the running time of collection-against-collection matching, with a specific focus on the MRSH-v2 algorithm. Three experiments are discussed, which explore the effects of different configurations of HBFTs. The proposed approach dramatically reduces the number of pairwise comparisons required, and demonstrates substantial speed gains, while maintaining effectiveness.
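The tree-based filtering idea in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation and not MRSH-v2: the `BloomFilter`, `features`, `Node`, and `candidates` names, the sliding-window feature extractor, and the filter sizes are all assumptions made for the example. Each node holds a Bloom filter over the features of every file beneath it, so a query descends only into subtrees whose filter reports enough hits, and the surviving leaves are the files that would still need a full pairwise comparison.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over byte strings (sizes are illustrative)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def features(data, size=8):
    """Hypothetical feature extractor: every size-byte sliding window."""
    return {data[i:i + size] for i in range(len(data) - size + 1)}


class Node:
    """Tree node whose filter covers all files in its subtree."""

    def __init__(self, files):
        # files: non-empty list of (name, data) pairs
        if not files:
            raise ValueError("a node needs at least one file")
        self.filter = BloomFilter()
        for _, data in files:
            for feature in features(data):
                self.filter.add(feature)
        if len(files) == 1:
            self.name = files[0][0]  # leaf: exactly one known file
            self.children = []
        else:
            mid = len(files) // 2
            self.name = None
            self.children = [Node(files[:mid]), Node(files[mid:])]


def candidates(node, query_features, min_hits=1):
    """Return names of leaf files that may share features with the query."""
    hits = sum(1 for feature in query_features if feature in node.filter)
    if hits < min_hits:
        return []  # prune: nothing below this node can reach min_hits
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(candidates(child, query_features, min_hits))
    return found


known_files = [
    ("a.bin", b"alpha alpha alpha 0123456789"),
    ("b.bin", b"bravo-bravo-bravo-ABCDEFGHIJ"),
    ("c.bin", b"charlie/charlie/charlie/xyz"),
]
root = Node(known_files)

# A query that embeds a fragment of b.bin between unrelated bytes.
query = b"junk" + known_files[1][1][3:20] + b"junk"
print(candidates(root, features(query), min_hits=4))
```

Because a Bloom filter never produces false negatives, a file that truly shares at least `min_hits` features with the query is always reached. False positives can only add extra candidates, which a subsequent pairwise comparison would then reject.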
Type of Material
Journal Article
Publisher
Journal of Digital Forensics, Security and Law
Journal
Journal of Association of Digital Forensics, Security and Law
Volume
13
Issue
1
Start Page
80
End Page
96
Copyright (Published Version)
2018 ADFSL
Keywords
  • Approximate matching
  • Hierarchical bloom fi...
  • mrsh-v2

DOI
10.15394/jdfsl.2018.1489
Language
English
Status of Item
Peer reviewed
ISSN
1558-7215
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
Owning collection
Computer Science Research Collection
University College Dublin Research Repository UCD
The Library, University College Dublin, Belfield, Dublin 4
Phone: +353 (0)1 716 7583
Fax: +353 (0)1 283 7667
Email: research.repository@ucd.ie
Guide: http://libguides.ucd.ie/rru
