  • Publication
    Joint palaeoclimate reconstruction from pollen data via forward models and climate histories
    We present a method and software for reconstructing palaeoclimate from pollen data with a focus on accounting for and reducing uncertainty. The tools we use include: forward models, which enable us to account for the data generating process and hence the complex relationship between pollen and climate; joint inference, which reduces uncertainty by borrowing strength between aspects of climate and slices of the core; and dynamic climate histories, which allow for a far richer gamut of inferential possibilities. Through a Monte Carlo approach we generate numerous equally probable joint climate histories, each of which is represented by a sequence of values of three climate dimensions in discrete time, i.e. a multivariate time series. All histories are consistent with the uncertainties in the forward model and the natural temporal variability in climate. Once generated, these histories can provide most probable climate estimates with uncertainty intervals. This is particularly important as attention moves to the dynamics of past climate changes. For example, such methods allow us to identify, with realistic uncertainty, the past century that exhibited the greatest warming. We illustrate our method with two data sets: Laguna de la Roya, with a radiocarbon dated chronology and hence timing uncertainty; and Lago Grande di Monticchio, which contains laminated sediment and extends back to the penultimate glacial stage. The procedure is made available via an open source R package, Bclim, for which we provide code and instructions.
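    To make the idea of summarising an ensemble of climate histories concrete, here is a minimal Python sketch. It does not use Bclim or any forward model: the histories are toy random walks, each time step stands in for one century, and the names `simulate_history` and `greatest_warming_start` are hypothetical. It only illustrates how a question such as "which century warmed most?" can be answered with a probability attached once many histories are available.

```python
import random
from collections import Counter


def simulate_history(n_steps, rng, step_sd=0.3):
    """One toy random-walk temperature history in discrete time (an assumption,
    standing in for a history drawn from a fitted model)."""
    temps = [0.0]
    for _ in range(n_steps - 1):
        temps.append(temps[-1] + rng.gauss(0.0, step_sd))
    return temps


def greatest_warming_start(temps, window=1):
    """Index of the window start with the largest temperature increase."""
    rises = [temps[i + window] - temps[i] for i in range(len(temps) - window)]
    return max(range(len(rises)), key=rises.__getitem__)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
histories = [simulate_history(20, rng) for _ in range(500)]

# For each history, record which step warmed most, then turn the tally
# into a probability for every candidate step.
tally = Counter(greatest_warming_start(h) for h in histories)
probabilities = {start: n / len(histories) for start, n in sorted(tally.items())}
```

    Because each history is treated as equally probable, the share of histories that pick a given step serves as the estimated probability that it was the step of greatest warming.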
  • Publication
    Repeatability analysis of airborne electromagnetic surveys
    Purpose: We provide methods for determining the repeatability of airborne electromagnetic surveys when conducted at different altitudes over a number of repeated flights. Our data arise from the TELLUS project carried out by the Geological Surveys of Ireland and Northern Ireland and we examine the repeatability of the apparent resistivity at different frequencies. Methods: After considering a number of issues with the data, we propose two different models from the functional data analysis literature: a Wiener process with random effects and a penalised spline smoother. Results: Both methods arrive at the same conclusion regarding repeatability of the data: results obtained are more repeatable for flights at lower altitudes. Conclusions: The target altitude for aircraft carrying out airborne electromagnetic surveys should be as low as possible.
  • Publication
    Prediction of tool-wear in turning of medical grade cobalt chromium molybdenum alloy (ASTM F75) using non-parametric Bayesian models
    We present a novel approach to estimating the effect of control parameters on tool wear rates and related changes in the three force components in turning of medical grade Co-Cr-Mo (ASTM F75) alloy. Co-Cr-Mo is known to be a difficult-to-cut material which, due to a combination of mechanical and physical properties, is used for the critical structural components of implantable medical prosthetics. We run a designed experiment which enables us to estimate tool wear from feed rate and cutting speed, and constrain them using a Bayesian hierarchical Gaussian Process model which enables prediction of tool wear rates for untried experimental settings. The predicted tool wear rates are non-linear and, using our models, we can identify experimental settings which optimise the life of the tool. This approach has potential in the future for real-time application of data analytics to machining processes.
  • Publication
    GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data
    Background: Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. Results: We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. Conclusions: GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.
  • Publication
    Exploring stable-based behaviour and behaviour switching for the detection of bilateral pain in equines
    Efficient and sensitive animal pain detection approaches are increasingly studied with the goal of improving animal welfare and monitoring the efficacy of treatment and rehabilitation. The aim of this study was to determine the potential of various behaviours as sensitive indicators of subtle inflammation states in equines. The long-term goal of this research is to understand how to objectively and remotely classify behaviours that are associated with inflammation using wearable inertial sensor technologies. This study represents a proof-of-concept investigation to ascertain what behavioural indices might be important in long-term monitoring of mild bilateral inflammation and recovery with a view to translating the approach to a technology-enabled remote monitoring paradigm. Bilateral synovitis of the intercarpal joints was induced in seven equines using lipopolysaccharide (0.25 ng) at time zero. The horses were confined to stables and monitored intermittently over seven days by stable-fixed video cameras. White blood cell count, carpal circumference and food availability were recorded across the study. An ethogram was created to manually annotate behaviours from video footage following lameness induction at seven different timepoints across a 1-week period. Behaviour data were processed extracting the duration, frequency and variability of behaviours. One-way repeated measures ANOVA revealed a significant time effect for white blood cell count and behaviour switching. There were no significant changes in carpal circumferences and heart rate measures over the sampling period. Food availability appears to be an important contextual factor that should be considered in pain-related behavioural studies. We conclude that behaviour variability may be a promising indicator of subtle bilateral inflammation which should be further explored in larger controlled trials and different pain presentations. 
    Future work will seek to optimise grouping of behaviours associated with inflammation that can be detected using wearable technologies for future remote monitoring protocols.
  • Publication
    Discussion Paper: A Bayesian multinomial regression model for paleoclimate reconstruction with time uncertainty
    (Wiley, 2016-11)
    In this discussion I hope to highlight some of the real contributions of this paper, to point out some of the important non-statistical considerations (of which, as applied statisticians in this area, we should be cognisant), and to contrast with the rapidly expanding, mostly Bayesian palaeoclimate statistics literature. In particular, accounting for time uncertainty is, I suspect, an almost unique challenge for time series analysis in palaeoclimate science. For this reason it has been ignored for decades. Now, with tools such as those in this paper, they can start to draw proper inferences on climate over time with suitably quantified uncertainties.
  • Publication
    Change points of global temperature
    We aim to address the question of whether or not there is a significant recent 'hiatus', 'pause' or 'slowdown' of global temperature rise. Using a statistical technique known as change point (CP) analysis we identify the changes in four global temperature records and estimate the rates of temperature rise before and after these changes occur. For each record the results indicate that three CPs are enough to accurately capture the variability in the data with no evidence of any detectable change in the global warming trend since ∼1970. We conclude that the term 'hiatus' or 'pause' cannot be statistically justified.
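    As a rough illustration of the change point idea, the Python sketch below searches for a single change point by fitting a separate least-squares line on each side of every candidate split and keeping the split with the smallest total squared error. This is a simplified stand-in, not the procedure used in the paper (which identifies three CPs per record); the series is synthetic and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def best_single_change_point(xs, ys, min_seg=5):
    """Try every split index and keep the one with the smallest total SSE.

    Returns (x value where the second segment starts, slope before, slope after).
    """
    best = None
    for k in range(min_seg, len(xs) - min_seg + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, xs[k], left[0], right[0])
    return best[1], best[2], best[3]


# Synthetic series (NOT a real temperature record): flat until 1970, then rising.
years = list(range(1940, 2011))
temps = [0.0 if y < 1970 else 0.02 * (y - 1970) for y in years]
# Small deterministic wiggle standing in for noise.
temps = [t + 0.01 * ((i % 3) - 1) for i, t in enumerate(temps)]

cp_year, slope_before, slope_after = best_single_change_point(years, temps)
```

    Comparing `slope_before` with `slope_after` mirrors the abstract's step of estimating the rate of temperature rise on either side of a change.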
  • Publication
    Highly variable recurrence of tsunamis in the 7,400 years before the 2004 Indian Ocean tsunami
    The devastating 2004 Indian Ocean tsunami caught millions of coastal residents and the scientific community off guard. Subsequent research in the Indian Ocean basin has identified prehistoric tsunamis, but the timing and recurrence intervals of such events are uncertain. Here we present an extraordinary 7,400-year stratigraphic sequence of prehistoric tsunami deposits from a coastal cave in Aceh, Indonesia. This record demonstrates that at least 11 prehistoric tsunamis struck the Aceh coast between 7,400 and 2,900 years ago. The average time period between tsunamis is about 450 years, with intervals ranging from a long, dormant period of over 2,000 years to multiple tsunamis within the span of a century. Although there is evidence that the likelihood of another tsunamigenic earthquake in Aceh province is high, these variable recurrence intervals suggest that long dormant periods may follow Sunda megathrust ruptures as large as that of the 2004 Indian Ocean tsunami.
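    The figure of about 450 years is consistent with simple arithmetic on the numbers quoted in the abstract. The Python sketch below assumes exactly 11 events, with the first and last falling at the ends of the 7,400 to 2,900 year window; the abstract does not say the average was computed this way, so treat it as a plausibility check only.

```python
# Values quoted in the abstract (years before present).
oldest_bp = 7400
youngest_bp = 2900
n_events = 11  # the abstract says "at least 11"; exactly 11 is assumed here

# n events spanning a window are separated by n - 1 intervals.
n_intervals = n_events - 1
mean_interval = (oldest_bp - youngest_bp) / n_intervals
```

    An average of this kind hides the spread the abstract emphasises: the individual intervals range from under a century to more than 2,000 years.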
  • Publication
    A novel method for quantifying overdispersion in count data with application to farmland birds
    The statistical modelling of count data permeates the discipline of ecology. Such data often exhibit overdispersion compared with a standard Poisson distribution, so that the variance of the counts is greater than the mean. Whereas modelling to reveal the effects of explanatory variables on the mean is commonplace, overdispersion is generally regarded as a nuisance parameter to be accounted for and subsequently ignored. Instead, we propose a method that models the overdispersion as a biologically interesting property of a data set and show how novel inference is provided as a result. We adapted the double hierarchical generalized linear model approach to create an easily extendible model structure that quantifies the influence of explanatory variables on the overdispersion of count data, and apply it to farmland birds. These data were from a study within Irish agricultural ecosystems, in which total bird species abundance and the abundance of farmland indicator species were compared on dairy and non-dairy farms in the winter and breeding seasons. In general, overdispersion in bird counts was greater on dairy farms than on non-dairy farms, and for total bird numbers, overdispersion was greatest on dairy farms in winter. Our model is fitted using the Bayesian package RStan, and we make all code and data available in a GitHub repository. Within a Bayesian framework, this approach facilitates a meaningful quantification of the effects of categorical explanatory variables on any response variable with a tendency to overdispersion that has a meaningful biological or ecological explanation.
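    For readers unfamiliar with overdispersion, the following Python sketch computes a simple descriptive dispersion index (sample variance divided by sample mean), which is close to 1 for Poisson-like counts and well above 1 for overdispersed counts. This is only a descriptive check, not the double hierarchical generalized linear model of the paper; the counts are invented for illustration and `dispersion_index` is a hypothetical helper.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


# Hypothetical bird counts per survey visit (illustrative values, not study data).
counts_a = [4, 5, 6, 5, 4, 6, 5, 5]    # tightly clustered around the mean
counts_b = [0, 1, 0, 14, 2, 0, 21, 2]  # same mean, far more variable

index_a = dispersion_index(counts_a)
index_b = dispersion_index(counts_b)
```

    The two samples share a mean, so a model of the mean alone cannot tell them apart; the difference lies entirely in the dispersion, which is the property the paper models as a function of explanatory variables.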
  • Publication
    A Bayesian hierarchical model for reconstructing relative sea level: from raw data to rates of change
    We present a Bayesian hierarchical model for reconstructing the continuous and dynamic evolution of relative sea-level (RSL) change with quantified uncertainty. The reconstruction is produced from biological (foraminifera) and geochemical (δ13C) sea-level indicators preserved in dated cores of salt-marsh sediment. Our model comprises three modules: (1) a new Bayesian transfer function (B-TF) for the calibration of biological indicators into tidal elevation, which is flexible enough to formally accommodate additional proxies; (2) an existing chronology developed using the Bchron age–depth model; and (3) an existing Errors-In-Variables integrated Gaussian process (EIV-IGP) model for estimating rates of sea-level change. Our approach is illustrated using a case study of Common Era sea-level variability from New Jersey, USA. We develop a new B-TF using foraminifera, with and without the additional (δ13C) proxy, and compare our results to those from a widely used weighted-averaging transfer function (WA-TF). The formal incorporation of a second proxy into the B-TF model results in smaller vertical uncertainties and improved accuracy for reconstructed RSL. The vertical uncertainty from the multi-proxy B-TF is ∼28% smaller on average compared to the WA-TF. When evaluated against historic tide-gauge measurements, the multi-proxy B-TF most accurately reconstructs the RSL changes observed in the instrumental record (mean square error = 0.003 m²). The Bayesian hierarchical model provides a single, unifying framework for reconstructing and analyzing sea-level change through time. This approach is suitable for reconstructing other paleoenvironmental variables (e.g., temperature) using biological proxies.