Predicting Protein Structural Annotations by Deep and Shallow Learning
Author(s)
Date Issued
2020
Date Available
2022-04-29T14:46:22Z
Abstract
This thesis discusses the prediction of Protein Structural Annotations by Deep and Shallow Learning and the fundamental position of these Annotations in Structural Bioinformatics, and in Bioinformatics in general. Proteins are profoundly characterised by their structure in every aspect of their functioning. While the number of known protein sequences has grown close to exponentially over the last decades, the number of known protein structures has grown closer to linearly because of the high complexity and cost of determining them. Protein Structure Predictors allow the structural study of proteins on a large scale and are therefore among the most thoroughly assessed tools in Bioinformatics (in venues such as CASP or CAMEO). This thesis presents the key types of Protein Structural Annotation and various Shallow and Deep Learning methods and algorithms for predicting them. From one-dimensional Protein Annotations (i.e. Secondary Structure, Solvent Accessibility and Torsion Angles) to more complex and informative two-dimensional protein abstractions (i.e. Contact and Distance maps), both mature and currently developing methods for Protein Structural Annotations are introduced. Particular attention is given to some of the best performing and freely available Deep and Shallow Learning methods for predicting Protein Structural Annotations that I helped to develop. In particular, I carried out a very large study of Neural Network-based methods with the following settings: Shallow Learning was first employed with or without evolutionary information, and more sophisticated approaches were then employed and refined step by step. This led to a robust state-of-the-art pipeline to predict Protein Structural Annotations by Deep Learning.
Finally, I used the extensively studied problem of Secondary Structure Prediction to show that the accuracy of state-of-the-art predictors is strongly correlated with the level of similarity between the training and test profiles extracted from evolutionary information. Based on this study, I propose a protocol to evaluate the accuracy of a predictor at the profile similarity level instead of the standard sequence level.
Type of Material
Doctoral Thesis
Publisher
University College Dublin. School of Computer Science
Qualification Name
Ph.D.
Copyright (Published Version)
2020 the Author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License