Optimal Probe Length Varies for Targets with High Sequence Variation: Implications for Probe Library Design for Resequencing Highly Variable Genes

DC Field | Value | Language
dc.contributor.author | Haslam, Niall J. | -
dc.contributor.author | Whiteford, Nava E. | -
dc.contributor.author | Weber, Gerald | -
dc.contributor.author | Prügel-Bennett, Adam | -
dc.contributor.author | Essex, Jonathan W. | -
dc.contributor.author | Neylon, Cameron | -
dc.date.accessioned | 2012-12-03T15:01:36Z | -
dc.date.available | 2012-12-03T15:01:36Z | -
dc.date.copyright | 2008 Haslam et al | en
dc.date.issued | 2008-06 | -
dc.identifier.citation | PLoS ONE | en
dc.identifier.uri | http://hdl.handle.net/10197/3945 | -
dc.description.abstract | Background: Sequencing by hybridisation (SBH) is an effective method for obtaining large amounts of DNA sequence information at low cost. The efficiency of SBH depends on designing the probe library to provide the maximum information for the minimum cost. Long probes are more likely to correspond to non-repeated sequences but increase the number of probes required, whereas short probes may not provide unique sequence information because of repeated sequences. We have investigated the effect of probe length, use of reference sequences, and thermal filtering on the design of probe libraries for several highly variable target DNA sequences. Results: We designed overlapping probe libraries for a range of highly variable drug target genes based on known sequence information and developed a formal terminology to describe probe library design. We find that for some targets these libraries can provide good coverage of a previously unseen target, whereas for others the coverage is less than 30%. The optimal probe length varies from as short as 12 nt to as long as 19 nt and depends on the sequence, its variability, and the stringency of thermal filtering. It cannot be determined from inspection of an example gene sequence. Conclusions: Optimal probe length and the optimal number of reference sequences used to design a probe library are highly target specific for highly variable sequencing targets. The optimum design cannot be determined simply by inspection of input sequences or of alignments, but only by detailed analysis of each specific target. For highly variable sequences, shorter probes can in some cases provide better information than longer probes. Probe library design would benefit from a general purpose tool for analysing these issues. The formal terminology developed here, and the analysis approaches it is used to describe, will contribute to the development of such tools. | en
dc.language.iso | en | en
dc.publisher | PLOS | en
dc.subject | Computational biology | en
dc.subject | Genetics and genomics | en
dc.subject.lcsh | Nucleotide sequence | en
dc.subject.lcsh | Computational biology | en
dc.subject.lcsh | Genomics--Methodology | en
dc.title | Optimal Probe Length Varies for Targets with High Sequence Variation: Implications for Probe Library Design for Resequencing Highly Variable Genes | en
dc.type | Journal Article | en
dc.internal.authorcontactother | niall.haslam@ucd.ie | -
dc.internal.availability | Full text available | en
dc.status | Peer reviewed | en
dc.identifier.volume | 3 | en
dc.identifier.issue | 6 | en
dc.identifier.startpage | e2500 | en
dc.identifier.doi | 10.1371/journal.pone.0002500 | -
dc.neeo.contributor | Haslam|Niall J.|aut| | -
dc.neeo.contributor | Whiteford|Nava E.|aut| | -
dc.neeo.contributor | Weber|Gerald|aut| | -
dc.neeo.contributor | Prügel-Bennett|Adam|aut| | -
dc.neeo.contributor | Essex|Jonathan W.|aut| | -
dc.neeo.contributor | Neylon|Cameron|aut| | -
dc.description.othersponsorship | Research Councils UK Basic Technology Programme | en
dc.description.admin | Author has checked copyright | en
dc.description.admin | DG 09/11/2012 | en
dc.description.admin | Names JG | en
dc.internal.rmsid | 182672616 | -
dc.date.updated | 2012-11-07T16:05:47Z | -
item.grantfulltext | open | -
item.fulltext | With Fulltext | -
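
The trade-off described in the abstract (longer probes are less likely to be repeated, but an overlapping library built from known sequences may then miss more of a variable, previously unseen target) can be illustrated with a minimal Python sketch. This is not the authors' method or software: the function names are hypothetical, exact string matching stands in for hybridisation, thermal filtering is not modelled, and the sequences are toy data chosen only for illustration.

```python
def build_probe_library(references, k):
    """Collect every overlapping k-mer (step of 1 nt) from the reference sequences."""
    probes = set()
    for seq in references:
        for i in range(len(seq) - k + 1):
            probes.add(seq[i:i + k])
    return probes


def coverage(target, probes, k):
    """Fraction of target bases lying under at least one exactly matching probe.

    Assumption: an exact match is used as a crude proxy for hybridisation.
    """
    covered = [False] * len(target)
    for i in range(len(target) - k + 1):
        if target[i:i + k] in probes:
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered) / len(target) if target else 0.0


def repeated_fraction(target, k):
    """Fraction of distinct k-mers in the target that occur more than once."""
    counts = {}
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c > 1) / len(counts)


if __name__ == "__main__":
    # Toy reference set and an "unseen" variant; not real gene sequences.
    references = ["ACGTACGTTAGC", "ACGTACCTTAGC"]
    target = "ACGTACGTTGGC"
    for k in (3, 4, 6):
        library = build_probe_library(references, k)
        print(k, len(library), coverage(target, library, k), repeated_fraction(target, k))
```

Sweeping `k` over a range of lengths and over different numbers of reference sequences, as the loop above does on toy data, is one simple way to see that the best probe length depends on the particular target rather than on a general rule.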
Appears in Collections: Medicine Research Collection
Files in This Item:
File | Description | Size | Format
Haslam_et_al._-_2008.pdf | | 277.19 kB | Adobe PDF

This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.