  • Publication
    Multi-level Attention-Based Neural Networks for Distant Supervised Relation Extraction
    We propose a multi-level attention-based neural network for relation extraction based on the work of Lin et al. to alleviate the problem of wrong labelling in distant supervision. In this paper, we first adopt gated recurrent units to represent the semantic information. Then, we introduce a customized multi-level attention mechanism, which is expected to reduce the weights of noisy words and sentences. Experimental results on a real-world dataset show that our model achieves significant improvement on relation extraction tasks compared to both traditional feature-based models and existing neural network-based methods.
      206
  • Publication
    Prediction of pathological response to neo‐adjuvant chemoradiotherapy for oesophageal cancer using vibrational spectroscopy
    In oesophageal cancer (OC), neo‐adjuvant chemoradiotherapy (neoCRT) is used to debulk tumour size prior to surgery, with a complete pathological response (pCR) observed in approximately 30% of patients. At present, no quantitative methodology exists that can predict response, in particular a pCR or major response (MR), in patients prior to therapy. Raman and Fourier transform infrared imaging were performed on OC tissue specimens acquired from 50 patients prior to therapy, to develop a computational model linking spectral data to treatment outcome. Modelling sensitivities and specificities above 85% were achieved using this approach. Parallel in‐vitro studies using an isogenic model of radioresistant OC supplied further insight into the OC cell spectral response to ionising radiation, where a potential spectral biomarker of radioresistance was observed at 977 cm⁻¹. This work demonstrates that chemical imaging may provide an option for triage of patients prior to neoCRT treatment, allowing more precise prescription of treatment.
      105
  • Publication
    Using Patient Information for the Prediction of Caregiver Burden in Amyotrophic Lateral Sclerosis
    The aim of this study is to create a Clinical Decision Support System (CDSS) to assist in the early identification and support of caregivers at risk of experiencing burden while caring for a person with Amyotrophic Lateral Sclerosis. We work towards a system that uses a minimum amount of data that could be routinely collected. We investigated if the impairment of patients alone provides sufficient information for the prediction of caregiver burden. Results reveal a better performance of our system in identifying those at risk of high burden, but more information is needed for an accurate CDSS.
      148
  • Publication
    Protein Backbone Angle Prediction in Multidimensional φ-ψ Space
    (University College Dublin. School of Computer Science and Informatics, 2006-01-20)
    A significant step towards establishing the structure and function of a protein is the prediction of the local conformation of the polypeptide chain. In this article we present systems for the prediction of three new alphabets of local structural motifs. The motifs are built by applying multidimensional scaling (MDS) and clustering to pairwise angular distances for multiple φ-ψ angle values collected from high-resolution protein structures. The predictive systems, based on ensembles of bidirectional recurrent neural network architectures, and trained on a large non-redundant set of protein structures, achieve 72%, 66% and 60% correct structural motif prediction on an independent test set for dipeptides (6 classes), tripeptides (8 classes) and tetrapeptides (14 classes), respectively, 28-30% above baseline statistical predictors. To demonstrate that structural motif predictions contain relevant structural information, we build a further system, based on ensembles of two-layered bidirectional recurrent neural networks, to map structural motif predictions into traditional 3-class (helix, strand, coil) secondary structure. This system achieves 79.5% correct prediction using the “hard” CASP 3-class assignment, and 81.4% with a more lenient assignment, outperforming a sophisticated state-of-the-art predictor (Porter) trained in the same experimental conditions. All the predictive systems will be provided free of charge to academic users and made publicly available at the address http://distill.ucd.ie/.
      15
  • Publication
    miRNA-Mediated Regulation of Adult Hippocampal Neurogenesis; Implications for Epilepsy
    Hippocampal neural stem/progenitor cells (NSPCs) proliferate and differentiate to generate new neurons across the life span of most mammals, including humans. This process takes place within a characteristic local microenvironment where NSPCs interact with a variety of other cell types and encounter systemic regulatory factors. Within this microenvironment, cell intrinsic gene expression programs are modulated by cell extrinsic signals through complex interactions, in many cases involving short non-coding RNA molecules, such as miRNAs. Here we review the regulation of gene expression in NSPCs by miRNAs and its possible implications for epilepsy, which has been linked to alterations in adult hippocampal neurogenesis.
      10
  • Publication
    Potential utility of docking to identify protein-peptide binding regions
    (University College Dublin. School of Computer Science and Informatics, 2013-05)
    Disordered regions of proteins often bind to structured domains, mediating interactions within and between proteins. However, it is difficult to identify a priori the short regions involved in binding. We set out to determine if docking peptides to peptide binding domains would assist in these predictions. First, we investigated the docking of known short peptides to their native and non-native peptide binding domains. We then investigated the docking of overlapping peptides adjacent to the native peptide. We found only weak discrimination of docking scores between native peptide and adjacent peptides in this context, with similar results for both ordered and disordered regions. Finally, we trained a bidirectional recurrent neural network using as input the peptide sequence, predicted secondary structure, Vina docking score and Pepsite score. We conclude that docking has only modest power to define the location of a peptide within a larger protein region known to contain it. However, this information can be used in training machine learning methods which may allow for the identification of peptide binding regions within a protein sequence.
      21
  • Publication
    Prediction of quality of life in people with ALS: on the road towards explainable clinical decision support
    Amyotrophic Lateral Sclerosis (ALS) is a rare neurodegenerative disease that causes a rapid decline in motor functions and has a fatal trajectory. ALS is currently incurable, so the aim of the treatment is mostly to alleviate symptoms and improve quality of life (QoL) for the patients. The goal of this study is to develop a Clinical Decision Support System (CDSS) to alert clinicians when a patient is at risk of experiencing low QoL. The source of data was the Irish ALS Registry and interviews with 90 patients and their primary informal caregivers at three time-points. In this dataset, there were two different scores to measure a person's overall QoL, based on the McGill QoL (MQoL) Questionnaire, and we worked towards the prediction of both. We used Extreme Gradient Boosting (XGBoost) for the development of the predictive models, which were compared to a logistic regression baseline model. Additionally, we used the Synthetic Minority Over-sampling Technique (SMOTE) to examine whether it would increase model performance, and SHAP (SHapley Additive exPlanations) as a technique to provide local and global explanations of the outputs as well as to select the most important features. The total calculated MQoL score was predicted accurately using three features (age at disease onset, ALSFRS-R score for orthopnoea and the caregiver's status pre-caregiving), with an F1-score on the test set equal to 0.81, recall of 0.78, and precision of 0.84. The addition of two extra features (caregiver's age and the ALSFRS-R score for speech) produced similar outcomes (F1-score 0.79, recall 0.70 and precision 0.90).
      32
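As a reading aid for the metrics quoted in the abstract above, the F1-score is the harmonic mean of precision and recall. The sketch below is not taken from the paper: the function name `f1_score` and the variable names are illustrative assumptions, and only the precision and recall figures come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract; names are hypothetical.
three_feature = f1_score(precision=0.84, recall=0.78)
five_feature = f1_score(precision=0.90, recall=0.70)

print(round(three_feature, 2), round(five_feature, 2))
```

Under this definition, both rounded values agree with the F1-scores stated in the abstract (0.81 and 0.79).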
  • Publication
    In silico approaches to predict the potential of milk protein-derived peptides as dipeptidyl peptidase IV (DPP-IV) inhibitors
    Molecular docking of a library of all 8000 possible tripeptides to the active site of DPP-IV was used to determine their binding potential. A number of tripeptides were selected for experimental testing; however, there was no direct correlation between the Vina score and their in vitro DPP-IV inhibitory properties. While Trp-Trp-Trp, the peptide with the best docking score, was a moderate DPP-IV inhibitor (IC50 216 μM), Lineweaver and Burk analysis revealed its action to be non-competitive. This suggested that it may not bind to the active site of DPP-IV as assumed in the docking prediction. Furthermore, there was no significant link between DPP-IV inhibition and the physicochemical properties of the peptides (molecular mass, hydrophobicity, hydrophobic moment (μH), isoelectric point (pI) and charge). LIGPLOTs indicated that competitive inhibitory peptides were predicted to have both hydrophobic and hydrogen bond interactions with the active site of DPP-IV. DPP-IV inhibitory peptides generally had a hydrophobic or aromatic amino acid at the N-terminus, preferentially a Trp for non-competitive inhibitors and a broader range of residues for competitive inhibitors (Ile, Leu, Val, Phe, Trp or Tyr). Two of the potent DPP-IV inhibitors, Ile-Pro-Ile and Trp-Pro (IC50 values of 3.5 and 44.2 μM, respectively), were predicted to be gastrointestinally/intestinally stable. This work highlights the need to test the assumptions (i.e. competitive binding) of any integrated strategy of computational and experimental screening when optimizing screening. Future strategies targeting allosteric mechanisms may need to rely more on structure-activity relationship modeling, rather than on docking, in computationally selecting peptides for screening.
      37
      Scopus© Citations 99
  • Publication
    Ab initio and homology based prediction of protein domains by recursive neural networks
    Background: Proteins, especially larger ones, are often composed of individual evolutionary units, domains, which have their own function and structural fold. Predicting domains is an important intermediate step in protein analyses, including the prediction of protein structures. Results: We describe novel systems for the prediction of protein domain boundaries powered by Recursive Neural Networks. The systems rely on a combination of primary sequence and evolutionary information, predictions of structural features such as secondary structure, solvent accessibility and residue contact maps, and structural templates, both annotated for domains (from the SCOP dataset) and unannotated (from the PDB). We gauge the contribution of contact maps, and PDB and SCOP templates independently and for different ranges of template quality. We find that accurately predicted contact maps are informative for the prediction of domain boundaries, while the same is not true for contact maps predicted ab initio. We also find that gap information from PDB templates is informative, but, not surprisingly, less than SCOP annotations. We test both systems trained on templates of all qualities, and systems trained only on templates of marginal similarity to the query (less than 25% sequence identity). While the first batch of systems produces near perfect predictions in the presence of fair to good templates, the second batch outperforms or matches ab initio predictors down to essentially any level of template quality. We test all systems in 5-fold cross-validation on a large non-redundant set of multi-domain and single domain proteins. The final predictors are state-of-the-art, with a template-less prediction boundary recall of 50.8% (precision 38.7%) within ± 20 residues and a single domain recall of 80.3% (precision 78.1%). The SCOP-based predictors achieve a boundary recall of 74% (precision 77.1%), again within ± 20 residues, and classify single domain proteins as such in over 85% of cases, when we allow a mix of bad and good quality templates. If we only allow marginal templates (max 25% sequence identity to the query) the scores remain high, with boundary recall and precision of 59% and 66.3%, and 80% of all single domain proteins predicted correctly. Conclusion: The systems presented here may prove useful in large-scale annotation of protein domains in proteins of unknown structure. The methods are available as public web servers at the address: http://distill.ucd.ie/shandy/ and we plan to run them on a multi-genomic scale and make the results public in the near future.
      668
      Scopus© Citations 14
  • Publication
    Prediction of caregiver burden in amyotrophic lateral sclerosis: a machine learning approach using random forests applied to a cohort study
    OBJECTIVES: Amyotrophic lateral sclerosis (ALS) is a rare neurodegenerative disease that is characterised by the rapid degeneration of upper and lower motor neurons and has a fatal trajectory 3-4 years from symptom onset. Due to the nature of the condition, patients with ALS require the assistance of informal caregivers, whose task is demanding and can lead to high feelings of burden. This study aims to predict caregiver burden and identify related features using machine learning techniques. DESIGN: This included demographic and socioeconomic information, quality of life, anxiety and depression questionnaires, for patients and carers, resource use of patients and clinical information. The method used for prediction was the random forest algorithm. SETTING AND PARTICIPANTS: This study investigates a cohort of 90 patients and their primary caregivers at three different time-points. The patients were attending the National ALS/Motor Neuron Disease Multidisciplinary Clinic at Beaumont Hospital, Dublin. RESULTS: The caregiver's quality of life and psychological distress were the most predictive features of burden (0.92 sensitivity and 0.78 specificity). The most predictive features for the Clinical Decision Support model were associated with the weekly caregiving duties of the primary caregiver as well as their age and health, and also the patient's physical functioning and age of onset. However, this model had lower sensitivity and specificity scores (0.84 and 0.72, respectively). The ability of patients without gastrostomy to cut food and handle utensils was also highly predictive of burden in this study. Generally, our models are better at predicting the high-risk category, and we suggest that information related to the caregiver's quality of life and psychological distress is required. CONCLUSION: This work demonstrates a proof of concept of an informatics solution to identifying caregivers at risk of burden that could be incorporated into future care pathways.
      117
      Scopus© Citations 11