Mathematics and Statistics Research Collection

Recent Submissions

  • Publication
    Combining biomarker and food intake data
    Recent developments in biomarker discovery have demonstrated that combining biomarkers with self-reported intake data has the potential to improve estimation of food intake. Here, statistical methods for combining biomarker and self-reported food intake data are discussed. The calibration equations method is a widely applied method that corrects for measurement error in self-reported food intake data through the use of biomarker data. The method is outlined and illustrated through an example where citrus intake is estimated. To estimate stable calibration equations, a simulation-based framework is outlined that estimates the percentage of study subjects from whom biomarker data are required. The method of triads is frequently used to assess the validity of self-reported food intake data by combining it with biomarker data. The method is outlined and sensitivity to its underlying assumptions is illustrated through simulation studies.
  • Publication
    Inferring food intake from multiple biomarkers using a latent variable model
    (Institute of Mathematical Statistics, 2021-12)
    Metabolomic-based approaches have gained much attention in recent years due to their promising potential to deliver objective tools for assessment of food intake. In particular, multiple biomarkers have emerged for single foods. However, there is a lack of statistical tools available for combining multiple biomarkers to quantitatively infer food intake. Furthermore, there is a paucity of approaches for estimating the uncertainty around biomarker-based inferred intake. Here, to estimate the relationship between multiple metabolomic biomarkers and food intake in an intervention study conducted under the A-DIET research programme, a latent variable model, multiMarker, is proposed. The multiMarker model integrates factor analytic and mixture of experts models: the observed biomarker values are related to intake, which is described as a continuous latent variable that follows a flexible mixture of experts model with Gaussian components. The multiMarker model also facilitates inference on the latent intake when only biomarker data are subsequently observed. A Bayesian hierarchical modelling framework provides flexibility to adapt to different biomarker distributions and facilitates inference of the latent intake along with its associated uncertainty. Simulation studies are conducted to assess the performance of the multiMarker model, prior to its use in the motivating application of quantifying apple intake.
  • Publication
    Numbers of close contacts of individuals infected with SARS-CoV-2 and their association with government intervention strategies
    Background: Contact tracing is conducted with the primary purpose of interrupting transmission from individuals who are likely to be infectious to others. Secondary analyses of data on the numbers of close contacts of confirmed cases could also: provide an early signal of increases in contact patterns that might precede larger than expected case numbers; evaluate the impact of government interventions on the number of contacts of confirmed cases; or provide information on contact rates between age cohorts for the purpose of epidemiological modelling. We analysed data from 140,204 close contacts of 39,861 cases in Ireland from 1st May to 1st December 2020. Results: Negative binomial regression models highlighted greater numbers of contacts within specific population demographics, after correcting for temporal associations. Separate segmented regression models of the number of cases over time and the average number of contacts per case indicated that a breakpoint indicating a rapid decrease in the number of contacts per case in October 2020 preceded a breakpoint indicating a reduction in the number of cases by 11 days. Conclusions: We found that the number of contacts per infected case was overdispersed, and that the mean varied considerably over time and was temporally associated with government interventions. Analysis of the reported number of contacts per individual in contact tracing data may be a useful early indicator of changes in behaviour in response to, or indeed despite, government restrictions. This study provides useful information for triangulating assumptions regarding the contact mixing rates between different age cohorts for epidemiological modelling.
  • Publication
    Bayesian Inference, Model Selection and Likelihood Estimation using Fast Rejection Sampling: The Conway-Maxwell-Poisson Distribution
    (International Society for Bayesian Analysis, 2021-09)
    Bayesian inference for models with intractable likelihood functions represents a challenging suite of problems in modern statistics. In this work we analyse the Conway-Maxwell-Poisson (COM-Poisson) distribution, a two-parameter generalisation of the Poisson distribution. COM-Poisson regression modelling allows the flexibility to model dispersed count data as part of a generalised linear model (GLM) with a COM-Poisson response, where exogenous covariates control the mean and dispersion level of the response. The major difficulty with COM-Poisson regression is that the likelihood function contains multiple intractable normalising constants and is not amenable to standard inference and Markov Chain Monte Carlo (MCMC) techniques. Recent work by Chanialidis et al. (2018) has seen the development of a sampler to draw random variates from the COM-Poisson likelihood using a rejection sampling algorithm. We provide a new rejection sampler for the COM-Poisson distribution which significantly reduces the central processing unit (CPU) time required to perform inference for COM-Poisson regression models. An extension of this work shows that for any intractable likelihood function with an associated rejection sampler it is possible to construct unbiased estimators of the intractable likelihood, which prove useful for model selection or for use within pseudo-marginal MCMC algorithms (Andrieu and Roberts, 2009). We demonstrate all of these methods on a real-world dataset of takeover bids.
  • Publication
    Variational Bayesian inference for the Latent Position Cluster Model for network data
    A number of recent approaches to modeling social networks have focussed on embedding the nodes in a latent “social space”. Nodes that are in close proximity are more likely to form links than those that are distant. This naturally accounts for reciprocal and transitive relationships which are commonly found in many network datasets. The Latent Position Cluster Model is one such model that also explicitly incorporates clustering by modeling the locations using a finite Gaussian mixture model. Observed covariates and sociality random effects may also be modeled. However, inference for the model via MCMC is cumbersome and thus scaling to large networks is a challenge. Variational Bayesian methods offer an alternative inference methodology for this problem. Sampling-based MCMC is replaced by an optimization that requires many orders of magnitude fewer iterations to converge. A Variational Bayesian algorithm for the Latent Position Cluster Model is therefore developed and demonstrated.
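
The latent "social space" idea in the last entry above can be illustrated with a minimal sketch. It assumes the common distance-based form in which the log-odds of a link equal an intercept minus the Euclidean distance between two latent positions. The function name `link_probability`, the intercept `beta`, and the node positions are hypothetical and chosen for illustration only; the sketch does not implement the clustering, covariates, random effects, or the variational algorithm described in the abstract.

```python
import math


def link_probability(z_i, z_j, beta=1.0):
    """Link probability under an assumed distance-based latent space model.

    Assumed form: logit(p_ij) = beta - ||z_i - z_j||, so nodes that are
    closer in the latent space are more likely to be linked.
    """
    distance = math.dist(z_i, z_j)  # Euclidean distance between positions
    return 1.0 / (1.0 + math.exp(-(beta - distance)))


# Hypothetical 2-D latent positions for three nodes.
positions = {"a": (0.0, 0.0), "b": (0.5, 0.0), "c": (4.0, 3.0)}

p_near = link_probability(positions["a"], positions["b"])  # distance 0.5
p_far = link_probability(positions["a"], positions["c"])   # distance 5.0

print(f"P(a-b) = {p_near:.3f}, P(a-c) = {p_far:.3f}")
```

Because the probability depends only on the distance, it is symmetric in the two nodes, and nodes that are both close to a third node are necessarily fairly close to each other, which is how this kind of model accounts for reciprocal and transitive relationships.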