University College Dublin
Model-based clustering with sparse covariance matrices

Author(s)
Fop, Michael  
Murphy, Thomas Brendan  
Scrucca, Luca  
Uri
http://hdl.handle.net/10197/11364
Date Issued
2019
Date Available
2020-05-05T14:00:43Z
Abstract
Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, since the number of model parameters scales quadratically with the number of variables, these models can be easily over-parameterized. For this reason, parsimonious models have been developed via covariance matrix decompositions or assuming local independence. However, these remedies do not allow for direct estimation of sparse covariance matrices nor do they take into account that the structure of association among the variables can vary from one cluster to the other. To this end, we introduce mixtures of Gaussian covariance graph models for model-based clustering with sparse covariance matrices. A penalized likelihood approach is employed for estimation and a general penalty term on the graph configurations can be used to induce different levels of sparsity and incorporate prior knowledge. Model estimation is carried out using a structural-EM algorithm for parameters and graph structure estimation, where two alternative strategies based on a genetic algorithm and an efficient stepwise search are proposed for inference. With this approach, sparse component covariance matrices are directly obtained. The framework results in a parsimonious model-based clustering of the data via a flexible model for the within-group joint distribution of the variables. Extensive simulated data experiments and application to illustrative datasets show that the method attains good classification performance and model quality. The general methodology for model-based clustering with sparse covariance matrices is implemented in the R package mixggm, available on CRAN.
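The quadratic growth in parameters mentioned in the abstract can be illustrated with a short parameter count. This is only a sketch, not code from the paper or from mixggm: the function names are hypothetical, and it assumes the standard count for a Gaussian mixture (mixing proportions, means, and one variance per variable plus one covariance per edge of each component's covariance graph, where a missing edge means a zero covariance).

```python
from typing import Sequence


def full_gmm_params(n_components: int, n_variables: int) -> int:
    """Free parameters of a Gaussian mixture with unrestricted covariances."""
    mixing = n_components - 1                      # proportions sum to one
    means = n_components * n_variables
    # Each symmetric covariance matrix has V(V + 1) / 2 free entries.
    covariances = n_components * n_variables * (n_variables + 1) // 2
    return mixing + means + covariances


def sparse_gmm_params(n_variables: int, edges_per_component: Sequence[int]) -> int:
    """Free parameters when each component has its own covariance graph.

    edges_per_component[k] is the number of non-zero off-diagonal
    covariances (graph edges) in component k; this is a hypothetical
    input format chosen for illustration.
    """
    max_edges = n_variables * (n_variables - 1) // 2
    if any(e < 0 or e > max_edges for e in edges_per_component):
        raise ValueError("edge count outside the range 0..V(V - 1)/2")
    n_components = len(edges_per_component)
    mixing = n_components - 1
    means = n_components * n_variables
    # V variances plus one covariance per edge, summed over components.
    covariances = sum(n_variables + e for e in edges_per_component)
    return mixing + means + covariances


if __name__ == "__main__":
    print(full_gmm_params(3, 10))              # 197
    print(sparse_gmm_params(10, [5, 0, 45]))   # 112
```

With 3 components and 10 variables, the unrestricted model has 197 free parameters. Letting the three components have 5, 0, and 45 edges respectively brings the count down to 112, and a component with all 45 edges recovers the full covariance matrix.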
Sponsorship
Science Foundation Ireland
Other Sponsorship
Insight Research Centre
Type of Material
Journal Article
Publisher
Springer
Journal
Statistics and Computing
Volume
29
Issue
4
Start Page
791
End Page
819
Copyright (Published Version)
2018 Springer
Subjects
Finite Gaussian mixtu...
Gaussian graphical mo...
Genetic algorithm
Model-based clusterin...
Penalized likelihood
Sparse covariance mat...
Stepwise search
Structural-EM algorit...
DOI
10.1007/s11222-018-9838-y
Language
English
Status of Item
Peer reviewed
ISSN
1573-1375
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
File(s)
Name
insight_publication.pdf
Size
1.03 MB
Format
Adobe PDF
Checksum (MD5)
2c797e4b213fe8c9ab08c565348da91d
Owning collection
Insight Research Collection
Mapped collections
Mathematics and Statistics Research Collection

Item descriptive metadata is released under a CC-0 (public domain) license: https://creativecommons.org/public-domain/cc0/.
All other content is subject to copyright.
