Publication

Role Analysis in Networks Using Mixtures of Exponential Random Graph Models

2015, Salter-Townshend, Michael, Murphy, Thomas Brendan

This article introduces a novel and flexible framework for investigating the roles of actors within a network. Particular interest is in roles as defined by local network connectivity patterns, identified using the ego-networks extracted from the network. A mixture of exponential-family random graph models (ERGM) is developed for these ego-networks to cluster the nodes into roles. We refer to this model as the ego-ERGM. An expectation-maximization algorithm is developed to infer the unobserved cluster assignments and to estimate the mixture model parameters using a maximum pseudo-likelihood approximation. We demonstrate the flexibility and utility of the method using examples of simulated and real networks.

Publication

Review of Statistical Network Analysis: Models, Algorithms, and Software

2012-08, Salter-Townshend, Michael, White, Arthur, Gollini, Isabella, Murphy, Thomas Brendan

The analysis of network data is an area that is rapidly growing, both within and outside of the discipline of statistics. This review provides a concise summary of methods and models used in the statistical analysis of network data, including the Erdős–Rényi model, the exponential family class of network models, and recently developed latent variable models. Many of the methods and models are illustrated by application to the well-known Zachary karate dataset. Software routines available for implementing methods are emphasized throughout. The aim of this paper is to provide a review with enough detail about many common classes of network models to whet the appetite and to point the way to further reading.
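
As a rough illustration of the simplest model named in this abstract, the sketch below simulates an undirected Erdős–Rényi G(n, p) graph and recovers p as the observed edge density, which is its maximum likelihood estimate. The function names, the seed, and the values of n and p are assumptions chosen for illustration; none of them come from the paper or its software.

```python
import random
from itertools import combinations


def simulate_er(n, p, seed=0):
    """Draw an undirected G(n, p) graph as a set of edges (i, j) with i < j.

    Each of the n*(n-1)/2 dyads is included independently with probability p.
    """
    rng = random.Random(seed)
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def estimate_p(n, edges):
    """MLE of p: number of observed edges divided by the number of dyads."""
    return len(edges) / (n * (n - 1) / 2)


# Illustrative values only (assumed, not taken from the paper).
edges = simulate_er(n=200, p=0.1, seed=42)
p_hat = estimate_p(200, edges)
print(f"estimated p = {p_hat:.3f}")
```

With 19,900 dyads the estimate should land close to the true value of 0.1; richer models such as ERGMs and latent variable models replace the single parameter p with structure-dependent terms.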

Publication

Sentiment Analysis of Online Media

2012-12-18, Salter-Townshend, Michael, Murphy, Thomas Brendan

A joint model for annotation bias and document classification is presented in the context of media sentiment analysis. We consider an Irish online media data set comprising online news articles with user annotations of negative, positive or irrelevant impact on the Irish economy. The joint model combines a statistical model for user annotation bias and a Naive Bayes model for the document terms. An EM algorithm is used to estimate the annotation bias model, the unobserved biases in the user annotations, the classifier parameters and the sentiment of the articles. The joint modeling of both the user biases and the classifier is demonstrated to be superior to estimation of the bias followed by the estimation of the classifier parameters.

Publication

Variational Bayesian inference for the Latent Position Cluster Model

2009-12, Salter-Townshend, Michael, Murphy, Thomas Brendan

Many recent approaches to modeling social networks have focussed on embedding the actors in a latent “social space”. Links are more likely for actors that are close in social space than for actors that are distant in social space. In particular, the Latent Position Cluster Model (LPCM) [1] allows for explicit modelling of the clustering that is exhibited in many network datasets. However, inference for the LPCM via MCMC is cumbersome and scaling of this model to large or even medium size networks with many interacting nodes is a challenge. Variational Bayesian methods offer one solution to this problem. An approximate, closed form posterior is formed, with unknown variational parameters. These parameters are tuned to minimize the Kullback-Leibler divergence between the approximate variational posterior and the true posterior, which is known only up to proportionality. The variational Bayesian approach is shown to give a computationally efficient way of fitting the LPCM. The approach is demonstrated on a number of data sets and it is shown to give a good fit.
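
The reason the Kullback-Leibler divergence can be minimized even though the true posterior is known only up to proportionality follows from a standard identity, sketched here. The symbols are assumptions introduced for illustration and are not the paper's notation: $Y$ is the observed network, $\theta$ collects the unknown parameters and latent positions, and $q(\theta)$ is the variational approximation.

```latex
\log p(Y)
  = \underbrace{\mathbb{E}_{q}\!\left[\log p(Y, \theta) - \log q(\theta)\right]}_{\mathcal{L}(q)}
  + \mathrm{KL}\!\left(q(\theta) \,\|\, p(\theta \mid Y)\right)
```

Because $\log p(Y)$ does not depend on $q$, minimizing the divergence is equivalent to maximizing $\mathcal{L}(q)$, and $\mathcal{L}(q)$ involves only the joint density $p(Y, \theta) = p(Y \mid \theta)\,p(\theta)$, that is, the posterior without its normalizing constant.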

Publication

Mixtures of biased sentiment analysers

2013-08-31, Salter-Townshend, Michael, Murphy, Thomas Brendan

Modelling bias is an important consideration when dealing with inexpert annotations. We are concerned with training a classifier to perform sentiment analysis on news media articles, some of which have been manually annotated by volunteers. The classifier is trained on the words in the articles and then applied to non-annotated articles. In previous work we found that a joint estimation of the annotator biases and the classifier parameters performed better than estimation of the biases followed by training of the classifier. An important question follows from this result: can the annotators be usefully clustered into either predetermined or data-driven clusters, based on their biases? If so, such a clustering could be used to select, drop or otherwise categorise the annotators in a crowdsourcing task. This paper presents work on fitting a finite mixture model to the annotators’ bias. We develop a model and an algorithm and demonstrate its properties on simulated data. We then demonstrate the clustering that exists in our motivating dataset, namely the analysis of potentially economically relevant news articles from Irish online news sources.

Publication

Sentiment analysis of online media

2012, Salter-Townshend, Michael, Murphy, Thomas Brendan

A joint model for annotation bias and document classification is presented in the context of media sentiment analysis. We consider an Irish online media data set comprising online news articles with user annotations of negative, positive or irrelevant impact on the Irish economy. The joint model combines a statistical model for user annotation bias and a Naive Bayes model for the document terms. An EM algorithm is used to estimate the annotation bias model, the unobserved biases in the user annotations, the classifier parameters and the sentiment of the articles. The joint modeling of both the user biases and the classifier is demonstrated to be superior to estimation of the bias followed by the estimation of the classifier parameters.

Publication

Variational Bayesian inference for the Latent Position Cluster Model for network data

2013-01, Salter-Townshend, Michael, Murphy, Thomas Brendan

A number of recent approaches to modeling social networks have focussed on embedding the nodes in a latent “social space”. Nodes that are in close proximity are more likely to form links than those that are distant. This naturally accounts for reciprocal and transitive relationships which are commonly found in many network datasets. The Latent Position Cluster Model is one such model that also explicitly incorporates clustering by modeling the locations using a finite Gaussian mixture model. Observed covariates and sociality random effects may also be modeled. However, inference for the model via MCMC is cumbersome and thus scaling to large networks is a challenge. Variational Bayesian methods offer an alternative inference methodology for this problem. Sampling-based MCMC is replaced by an optimization that requires many orders of magnitude fewer iterations to converge. A Variational Bayesian algorithm for the Latent Position Cluster Model is therefore developed and demonstrated.
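
The statement that nearby nodes are more likely to link can be made concrete with the usual latent distance form, in which the log-odds of a link equal an intercept minus the Euclidean distance between the two latent positions. The sketch below assumes that form without covariates or sociality effects; the function name, the intercept `beta`, and the example coordinates are hypothetical and are not taken from the paper.

```python
import math


def link_probability(z_i, z_j, beta):
    """Probability of a link under a latent distance model (assumed form).

    log-odds = beta - ||z_i - z_j||, so the probability decays as the
    two latent positions move apart.
    """
    distance = math.dist(z_i, z_j)
    return 1.0 / (1.0 + math.exp(-(beta - distance)))


# Hypothetical positions in a two-dimensional social space.
p_close = link_probability((0.0, 0.0), (0.5, 0.0), beta=1.0)
p_far = link_probability((0.0, 0.0), (4.0, 3.0), beta=1.0)
print(f"close pair: {p_close:.3f}, distant pair: {p_far:.3f}")
```

In the Latent Position Cluster Model the positions themselves are given a finite Gaussian mixture prior, which is what produces the clustering; that part is not shown here.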