Novel mixture models for bounded, high dimensional and dependent omics data
Author(s)
Date Issued
2025
Date Available
2025-10-24T09:00:02Z
Abstract
DNA methylation plays a key role in regulating gene expression and organism development, and involves the addition or removal of a methyl group at cytosine-guanine dinucleotide (CpG) sites. Aberrant methylation changes can silence tumour suppressor genes. Differential analysis, such as identifying differentially methylated CpG sites (DMCs), differentially methylated regions (DMRs) and differentially expressed genes (DEGs) between conditions like benign and tumour samples, can aid in understanding disease progression. Existing statistical methods for differential analysis of methylation data, which are bounded on the unit interval, often require transformations that reduce biological interpretability. To address this gap, novel mixture models are proposed that are appropriate for analysing beta-distributed, high-dimensional methylation data and for identifying DMCs. Furthermore, spatial correlation is present among the methylation levels of adjacent CpG sites. A hidden Markov model approach is proposed that accounts for these spatial dependencies, allowing the identification of both DMCs and DMRs. Moreover, gene expression and methylation are biologically linked, with methylation affecting gene activation or suppression, yet methylation and gene expression data are typically analysed separately. A joint mixture model is proposed to integrate methylation and gene expression data, simultaneously identifying DMCs and DEGs and providing insight into their dependencies, thereby giving a comprehensive understanding of the epigenetic and transcriptional landscape. This thesis presents novel mixture models to analyse differential patterns in high-dimensional and dependent omics data, focusing specifically on epigenetic and transcriptomic data. The proposed methods are evaluated through simulation studies and applied to publicly available cancer datasets. Additionally, open-source R software packages are available to enable broader use of the proposed models.
Type of Material
Doctoral Thesis
Qualification Name
Doctor of Philosophy (Ph.D.)
Publisher
University College Dublin. School of Mathematics and Statistics
Copyright (Published Version)
2025 the Author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
File(s)
Name
Koyel_Majumdar_PhD_thesis_resubmission.pdf
Size
19.41 MB
Format
Adobe PDF
Checksum (MD5)
713e75e4bda909cc1a3144e17509217e
Owning collection