Model-based clustering methods for networks with edge weights and node features
Author(s)
Date Issued
2025
Date Available
2025-10-21T09:05:51Z
Abstract
In this thesis, several developments in the area of model-based clustering for network data are described. A novel stochastic block model for clustering nodes in proportion-weighted networks is proposed, which models the vectors of edge weights of the sending nodes via a Dirichlet distribution, with the parameters determined by the cluster membership of the sending and the receiving nodes. Inference is implemented via a variant of the classification expectation-maximisation algorithm for hybrid likelihood, and model selection is addressed via the integrated completed likelihood criterion. An alternative methodology for clustering nodes in composition-weighted networks is also proposed for comparative purposes. The efficacy of the proposed model is showcased through a set of simulation studies, and its practical utility is illustrated on the Erasmus student exchange network and the bike sharing scheme network of London. The aforementioned framework is then adapted to networks exhibiting sparsity, as well as to multiplex networks, to widen the general applicability and utility of the model. The associated inferential procedure, based on the hybrid log-likelihood (analogous to the original model), is outlined. Model selection is addressed via the integrated completed likelihood criterion as well as an approximated Bayesian information criterion. The performance of the extended model is tested on synthetic data as well as on a Food and Agriculture Organization trade multiplex network. Finally, the thesis proposes a model-based clustering framework for networks with additional node features. This approach jointly models the presence of edges in a network as a function of similarity between the features of the nodes and the node features themselves. Therefore, it does not rely on an independence assumption between the node-level data and the network topology. The proposed model is based on an existing clustering framework, which is improved upon as part of this thesis. In particular, the original model is implemented via a more rigorous procedure, and a model selection criterion is derived. These tools are then used as building blocks for the proposed extended model. The modelling framework is applied to synthetic as well as real-world data to showcase its usefulness.
Type of Material
Doctoral Thesis
Qualification Name
Doctor of Philosophy (Ph.D.)
Publisher
University College Dublin. School of Mathematics and Statistics
Copyright (Published Version)
2025 the Author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
File(s)
Name
PhD_Thesis_final_Iuliia_Promskaia.pdf
Size
4.41 MB
Format
Adobe PDF
Checksum (MD5)
ab4086b66882c77970fd840cccbfe255
Owning collection