University College Dublin

Model-based clustering with sparse covariance matrices

File(s)
insight_publication.pdf (1.03 MB)
Author(s)
Fop, Michael 
Murphy, Thomas Brendan 
Scrucca, Luca 
Uri
http://hdl.handle.net/10197/11364
Date Issued
2019
Date Available
2020-05-05T14:00:43Z
Abstract
Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, since the number of model parameters scales quadratically with the number of variables, these models can be easily over-parameterized. For this reason, parsimonious models have been developed via covariance matrix decompositions or assuming local independence. However, these remedies do not allow for direct estimation of sparse covariance matrices nor do they take into account that the structure of association among the variables can vary from one cluster to the other. To this end, we introduce mixtures of Gaussian covariance graph models for model-based clustering with sparse covariance matrices. A penalized likelihood approach is employed for estimation and a general penalty term on the graph configurations can be used to induce different levels of sparsity and incorporate prior knowledge. Model estimation is carried out using a structural-EM algorithm for parameters and graph structure estimation, where two alternative strategies based on a genetic algorithm and an efficient stepwise search are proposed for inference. With this approach, sparse component covariance matrices are directly obtained. The framework results in a parsimonious model-based clustering of the data via a flexible model for the within-group joint distribution of the variables. Extensive simulated data experiments and application to illustrative datasets show that the method attains good classification performance and model quality. The general methodology for model-based clustering with sparse covariance matrices is implemented in the R package mixggm, available on CRAN.
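To make the abstract's point about quadratic scaling concrete, the following is a minimal sketch that counts free parameters in a Gaussian mixture with unrestricted covariance matrices and in one where each component covariance is constrained by a graph. It assumes that a covariance graph with a given number of edges contributes one variance per variable plus one covariance per edge. The function names are hypothetical and are not part of the mixggm package.

```python
from typing import Sequence


def full_mixture_param_count(n_components: int, n_vars: int) -> int:
    """Free parameters of a Gaussian mixture with unrestricted covariances."""
    weights = n_components - 1                         # mixing proportions sum to one
    means = n_components * n_vars                      # one mean vector per component
    covs = n_components * n_vars * (n_vars + 1) // 2   # one symmetric matrix per component
    return weights + means + covs


def sparse_mixture_param_count(n_vars: int, edges_per_component: Sequence[int]) -> int:
    """Free parameters when component k has edges_per_component[k] nonzero
    off-diagonal covariances (assumption: one parameter per edge)."""
    max_edges = n_vars * (n_vars - 1) // 2
    for n_edges in edges_per_component:
        if not 0 <= n_edges <= max_edges:
            raise ValueError(f"edge count must lie in [0, {max_edges}]")
    n_components = len(edges_per_component)
    weights = n_components - 1
    means = n_components * n_vars
    # Each component keeps its n_vars variances plus one covariance per edge.
    covs = sum(n_vars + n_edges for n_edges in edges_per_component)
    return weights + means + covs


if __name__ == "__main__":
    print(full_mixture_param_count(3, 10))             # 197
    print(sparse_mixture_param_count(10, [5, 0, 12]))  # 79
```

With three components and ten variables, the unrestricted model has 197 free parameters, while the illustrative sparse configuration above has 79. Allowing a different edge count per component reflects the abstract's remark that the association structure can vary from one cluster to another.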
Sponsorship
Science Foundation Ireland
Other Sponsorship
Insight Research Centre
Type of Material
Journal Article
Publisher
Springer
Journal
Statistics and Computing
Volume
29
Issue
4
Start Page
791
End Page
819
Copyright (Published Version)
2018 Springer
Keywords
  • Finite Gaussian mixtu...
  • Gaussian graphical mo...
  • Genetic algorithm
  • Model-based clusterin...
  • Penalized likelihood
  • Sparse covariance mat...
  • Stepwise search
  • Structural-EM algorit...
DOI
10.1007/s11222-018-9838-y
Language
English
Status of Item
Peer reviewed
ISSN
1573-1375
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
Owning collection
Insight Research Collection