Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information

Files in This Item:
File: Pollastri_2007.pdf; Size: 349.19 kB; Format: Adobe PDF
Title: Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information
Authors: Pollastri, Gianluca
Martin, Alberto J. M.
Mooney, Catherine
Vullo, Alessandro
Permanent link: http://hdl.handle.net/10197/3394
Date: 14-Jun-2007
Abstract: Background: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio.
Results: Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems to their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available.
Conclusion: The predictive systems are publicly available at the address http://distill.ucd.ie
Funding Details: Science Foundation Ireland
Irish Research Council for Science, Engineering and Technology
Health Research Board
Type of material: Journal Article
Publisher: BioMed Central
Copyright (published version): 2007 Pollastri et al; licensee BioMed Central Ltd.
Keywords: Secondary structure; Solvent accessibility; Neural networks; Protein structure prediction
Subject LCSH: Proteins--Structure
Homology (Biology)
Neural networks (Computer science)
DOI: 10.1186/1471-2105-8-201
Language: en
Status of Item: Peer reviewed
Appears in Collections: Computer Science Research Collection
CASL Research Collection

This item is licensed under a Creative Commons License.