The impact of guide trees in large-scale protein multiple sequence alignments
Date Issued
2016
Date Available
2017-08-27T01:00:26Z
Abstract
The focus of this thesis is on large-scale progressive protein multiple sequence alignment algorithms. Although first developed over 30 years ago, multiple sequence alignment algorithms are still an active area of research, given their widespread use in many biological analyses and the dramatic increase in sequence information over the years. This work examines the behaviour of the existing algorithms with large numbers of sequences, and in particular the impact of guide trees on the alignments generated.
This thesis is divided into five chapters. Chapter 1 introduces the concept of a multiple sequence alignment, its uses and how it is constructed. It also details the specifics of progressive alignments, describes how guide trees are constructed, and provides an overview of a number of the ways in which the quality of an alignment can be measured.
Chapter 2 examines the impact that the topology of the guide tree has on the generated alignment. It finds that simply aligning sequences one after another can produce higher quality alignments than the default alignment methods when measured using structure-based benchmarks. This increase in quality is particularly noticeable with larger alignments. It also finds that randomly ordering the sequences produces alignments of similar quality to any of the other orderings examined.
Chapter 3 finds that, because of a tradeoff between alignment accuracy and computation time, larger alignments generated by some of the most common multiple sequence alignment programs are inherently unstable: changing the order in which the sequences are listed in the input file causes a different alignment to be created.
Chapter 4 proposes an ordering of the sequences to be aligned that produces a better quality alignment than the random ordering identified in Chapter 2. It also attempts to resolve the instability issue identified in the previous chapter.
Finally, Chapter 5 reviews the findings presented in the thesis, and proposes possible future steps to both use and continue to develop these findings.
Type of Material
Doctoral Thesis
Publisher
University College Dublin. School of Medicine
Qualification Name
Ph.D.
Copyright (Published Version)
2016 the author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
Views
1504
Acquisition Date
Mar 25, 2024
Downloads
312