Now showing 1 - 10 of 18
  • Publication
    A Modified Watermark Synchronisation Code for Robust Embedding of Data in DNA
    DNA data embedding is a newly emerging field aspiring to encode data in deoxyribonucleic acid (DNA). DNA is an inherently digital and noisy medium, undergoing substitution, insertion and deletion mutations. Hence, encoding information in DNA can be seen as a particular case of digital communications in which biological constraints must be observed. In this paper we propose a modification of Davey and MacKay’s watermark synchronisation code (unrelated to digital watermarking) to create an encoding procedure more biocompatible with the host organism than previous methods. In addition, when combined with a low density parity check (LDPC) code, the method provides near-optimum error correction. We also obtain the theoretical embedding capacity of DNA under substitution mutations for the increased biocompatibility constraint. This result, along with an existing bound on capacity for insertion and deletion mutations, is compared to the proposed algorithm’s performance by means of Monte Carlo simulations
      662
  • Publication
    Security of keyed DNA data embedding
    The emergent field of DNA data embedding has, in the past decade, seen the proposal of many methods to encode data in the genome of living organisms. These algorithms have primarily focused on incorporating error correction into encoded data. However the security aspects of data encoded in DNA have generally been overlooked. As with any data hiding method, information encoded in DNA can be protected with a secret key known only to authorized individuals. While there have been suggestions to incorporate keys, no well defined functions for mapping keys to parameters have been proposed yet. Furthermore there have been no security analyses of keyed DNA data embedding methods, in terms of quantifying how difficult it is for an attacker to find the key. In this paper, modifications to incorporate secret keys into a prominent embedding algorithm are proposed, and the security is analysed under realistic attack scenarios. The algorithm to be modified, called BioCode ncDNA has been proposed by us, and is designed for encoding data in vivo (in the cells of living organism) with high biocompatibility and near-optimum embedding rate.
      129
  • Publication
    Gene Tagging and the Data Hiding Rate
    (The Institution of Engineering and Technology, 2012-06-28) ;
    We analyze the maximum number of ways in which one can intrinsically tag a very particular kind of digital asset: a gene, which is just a DNA sequence that encodes a protein. We consider gene tagging under the most relevant biological constraints: protein encoding preservation with and without codon count preservation. We show that our finite and deterministic combinatorial results are asymptotically—as the length of the gene increases— particular cases of the stochastic Gel’fand and Pinsker capacity formula for communications with side information at the encoder, which lies at the foundations of data hiding theory. This is because gene tagging is a particular case of DNA watermarking.
      289
  • Publication
    Optimum Reversible Data Hiding and Permutation Coding
    (IEEE, 2015-11-19)
    This paper is mainly devoted to investigating the connection between binary reversible data hiding and permutation coding. We start by undertaking an approximate combinatorial analysis of the embedding capacity of reversible watermarking in the binary Hamming case, which asymptotically shows that optimum reversible watermarking must involve not only 'writing on dirty paper', as in any blind data hiding scenario, but also writing on the dirtiest parts of the paper. The asymptotic analysis leads to the information-theoretical result given by Kalker and Willems more than a decade ago. Furthermore, the novel viewpoint of the problem suggests a near-optimum reversible watermarking algorithm for the low embedding distortion regime based on permutation coding. A practical implementation of permutation coding, previously proposed in the context of maximum-rate perfect steganography of memoryless hosts, can be used to implement the algorithm. The paper concludes with a discussion on the evaluation of the general rate-distortion bound for reversible data hiding.
      308
  • Publication
    The role of permutation coding in minimum-distortion perfect counterforensics
    (IEEE, 2014-05)
    This paper exploits the connection between minimum-distortion perfect counterforensics and maximum-rate perfect steganography in order to provide the optimum solution to the first of these problems, in the case in which the forensic detector solely uses first order statistics. The solution relies on Slepian’s variant I permutation codes, which had previously been shown to implement maximum rate perfect steganography when the host is memoryless (equivalently, when the steganographic detector only uses first-order statistics). Additionally, we demonstrate a blind counterforensic strategy made possible by permutation decoding, which may also find application in image processing.
      335
  • Publication
    Performance of DNA data embedding algorithms under substitution mutations
    DNA data embedding is a relatively recent area which aims at embedding arbitrary information in deoxyribonucleic acid (DNA) strands. One interesting application of DNA data embedding can be tracing pathways of genetic material in novel ways. This paper explores the decoding performance of several DNA data embedding algorithms proposed in the literature, which are also briefly reviewed. DNA may undergo random mutations, which can cause errors at the decoding stage of such algorithms. Although some proposed methods do account for such errors, decoding performance under mutations has not been previously studied in general. The empirical performance comparison that we provide here allows to fairly compare a number of DNA data embedding algorithms under mutations for the first time. The evaluation is undertaken by means of Monte Carlo simulations. Additionally, we propose two new DNA data embedding algorithms with good robustness properties.
      616
  • Publication
    How to (Possibly) Foil Multimedia Security?
    Multimedia security can be foiled thanks to Slepian’s permutation modulation. Originally proposed in 1965 for standard problems of channel and source coding in communications, permutation codes can also provide optimum solutions in two relevant fields: steganography (foiling hidden information detection tests) and counterforensics (foiling forensic detection tests). In the first scenario, permutation codes have been shown to implement optimum perfect universal steganography (that is to say, steganography with maximum information embedding rate, undetectable and only relying on the empirical distribution of the host) for histogram-based hidden information detectors. In the second scenario, permutation codes have been shown to implement minimum-distortion perfect counterforensics (that is to say, forgeries which are undetectable and as close as possible to a target forgery) for histogram-based forensic detectors. Interestingly, both of these developments have revealed connections with compression through theoretical bounds from the mathematical theory of information. In steganography, the long-acknowledged duality between perfect steganography and lossless compression has been made explicit by permutation coding. On the other hand, a connection between counterforensics, lossy compression and histogram specification is also shown.
      295Scopus© Citations 1
  • Publication
    Permutation Codes and Steganography
    We show that Slepian’s Variant I permutation codes implement first-order perfect steganography (i.e., histogram-preserving steganography). We give theoretical expressions for the embedding distortion, embedding rate and embedding efficiency of permutation codes in steganography, which demonstrate that these codes conform to prior analyses of the properties of capacity-achieving perfect stegosystems with a passive warden. We also propose a modification of adaptive arithmetic coding that near optimally implements permutation coding with a low complexity, confirming all our theoretical predictions. Finally we discuss how to control the embedding distortion. Permutation coding turns out to be akin to Sallee’s model-based steganography, and to supersede both this method and LSB matching.
      476Scopus© Citations 5
  • Publication
    Repetition coding as an effective error correction code for embedding information in DNA
    The goal of DNA data embedding is to enable robust encoding of non-genetic information in DNA. This field straddles the areas of bioinformatics and digital communications, since DNA mutations can be seen as akin to a noisy channel from the point of view of information encoding. In this paper we present two algorithms which, building on a variant of a method proposed by Yachie et al., rely on repetition coding to effectively counteract the impact that mutations have on an embedded message. The algorithms are designed for resynchronising multiple, originally identical, information encoded DNA sequences, embedded within non-coding DNA (ncDNA) sections of a host genome. They use both BLAST and MUSCLE algorithms to accomplish this. Bit error rates at the decoder are established for mutations rates accumulated over a number of generations of the host organism. The empirical results obtained are compared to a theoretical bound for optimal decoding.
      262Scopus© Citations 11
  • Publication
    Towards Optimum Counterforensics of Multiple Significant Digits Using Majorisation-Minimisation
    Optimum counterforensics of the first significant digits entails a forger minimally modifying a forgery in such a way that its first significant digits follow some preselected authentic distribution, e.g., Benford’s law. A solution to this problem based on the simplex algorithm was put forward by Comesa Optimum counterforensics of the first significant digits entails a forger minimally modifying a forgery in such a way that its first significant digits follow some preselected authentic distribution, e.g., Benford’s law. A solution to this problem based on the simplex algorithm was put forward by Comesana and Perez-Gonzalez. However their approach requires scaling up the dimensionality of the original problem. As simplex has exponential worst-case complexity, simplex implementations can struggle to cope with medium to large scale problems. These computational issues get compounded by upscaling the problem dimensionality. Furthermore, Benford’s law applies beyond the first significant digit, but no counterforensics method to date offers a solution to handle an arbitrary number of significant digits. As the use of simplex would only aggravate the computational issues in this case, we propose a more scalable approach to counterforensics of multiple significant digits informed by the Majorisation-Minimisation optimisation philosophy.
      297