University College Dublin

Novel mixture models for bounded, high dimensional and dependent omics data

Author(s)
Majumdar, Koyel  
URI
http://hdl.handle.net/10197/29384
Date Issued
2025
Date Available
2025-10-24T09:00:02Z
Abstract
DNA methylation plays a key role in regulating gene expression and organism development and involves adding or removing a methyl group at cytosine-guanine dinucleotide (CpG) sites. Aberrant methylation changes can silence tumour suppressor genes. Differential analysis, such as identifying differentially methylated CpG sites (DMCs), differentially methylated regions (DMRs) and differentially expressed genes (DEGs) between conditions such as benign and tumour samples, can aid in understanding disease progression. Existing statistical methods for differential analysis of methylation data, which are bounded on the unit interval, often require transformations that reduce biological interpretability. To address this gap, novel mixture models are proposed that are appropriate for analysing beta-distributed, high-dimensional methylation data and identifying DMCs. Furthermore, the methylation levels of adjacent CpG sites are spatially correlated. A hidden Markov model approach is proposed that accounts for these spatial dependencies, allowing the identification of DMCs and DMRs. Moreover, gene expression and methylation are biologically linked, with methylation affecting gene activation or suppression, yet methylation and gene expression data are typically analysed separately. A joint mixture model is proposed to integrate methylation and gene expression data, simultaneously identifying DMCs and DEGs and providing insight into their dependencies, and thus a more comprehensive understanding of the epigenetic and transcriptional landscape. This thesis presents novel mixture models to analyse differential patterns in high-dimensional and dependent omics data, specifically focusing on epigenetic and transcriptomic data. The proposed methods are evaluated through simulation studies and are applied to publicly available cancer datasets. Additionally, open-source R software packages are available to enable broader use of the proposed models.
Type of Material
Doctoral Thesis
Qualification Name
Doctor of Philosophy (Ph.D.)
Publisher
University College Dublin. School of Mathematics and Statistics
Copyright (Published Version)
2025 the Author
Subjects

Mixture models

Bounded high-dimensio...

Dependent omics data

Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
File(s)
Name
Koyel_Majumdar_PhD_thesis_resubmission.pdf
Size
19.41 MB
Format
Adobe PDF
Checksum (MD5)
713e75e4bda909cc1a3144e17509217e
Owning collection
Mathematics and Statistics Theses

Item descriptive metadata is released under a CC-0 (public domain) license: https://creativecommons.org/public-domain/cc0/.
All other content is subject to copyright.

For all queries please contact research.repository@ucd.ie.
