CASL Research Collection
Browsing CASL Research Collection by Title
- Publication: 3D Location and Orientation Estimation using Angle of Arrival
This paper discusses the problem of joint location and orientation estimation of a receiver using only angle of arrival (AOA) information. Conventional formulations of the problem consist of a number of nonlinear equations where the number of unknowns exceeds the number of equations. However, the formulations presented in this paper simplify the problem in a way that leads to efficient solutions. Two solutions are presented and their performance is compared via simulations using an indoor application as an example. Results emphasize the effectiveness of the proposed methods.
Scopus© Citations 6 418
- Publication: Ab initio and homology based prediction of protein domains by recursive neural networks (BioMed Central, 2009-06-26)
Background: Proteins, especially larger ones, are often composed of individual evolutionary units, domains, which have their own function and structural fold. Predicting domains is an important intermediate step in protein analyses, including the prediction of protein structures. Results: We describe novel systems for the prediction of protein domain boundaries powered by Recursive Neural Networks. The systems rely on a combination of primary sequence and evolutionary information, predictions of structural features such as secondary structure, solvent accessibility and residue contact maps, and structural templates, both annotated for domains (from the SCOP dataset) and unannotated (from the PDB). We gauge the contribution of contact maps, and PDB and SCOP templates independently and for different ranges of template quality. We find that accurately predicted contact maps are informative for the prediction of domain boundaries, while the same is not true for contact maps predicted ab initio. We also find that gap information from PDB templates is informative, but, not surprisingly, less than SCOP annotations. We test both systems trained on templates of all qualities, and systems trained only on templates of marginal similarity to the query (less than 25% sequence identity). While the first batch of systems produces near perfect predictions in the presence of fair to good templates, the second batch outperforms or matches ab initio predictors down to essentially any level of template quality. We test all systems in 5-fold cross-validation on a large non-redundant set of multi-domain and single domain proteins. The final predictors are state-of-the-art, with a template-less prediction boundary recall of 50.8% (precision 38.7%) within ± 20 residues and a single domain recall of 80.3% (precision 78.1%).
The SCOP-based predictors achieve a boundary recall of 74% (precision 77.1%) again within ± 20 residues, and classify single domain proteins as such in over 85% of cases, when we allow a mix of bad and good quality templates. If we only allow marginal templates (max 25% sequence identity to the query) the scores remain high, with boundary recall and precision of 59% and 66.3%, and 80% of all single domain proteins predicted correctly. Conclusion: The systems presented here may prove useful in large-scale annotation of protein domains in proteins of unknown structure. The methods are available as public web servers at the address: http://distill.ucd.ie/shandy/ and we plan to run them on a multi-genomic scale and make the results public in the near future.
Scopus© Citations 15 751
- Publication: Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks (BioMed Central, 2009-01-30)
Background: Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is, without exploiting information about potential homologues of known structure. Results: We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, or relying on the sequence and on evolutionary information; one template-based, or in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates.
Furthermore, we test both ab initio and template-based 8 Å predictions on the CASP7 targets using a pre-CASP7 PDB, and find that both predictors are state-of-the-art, with the template-based one far outperforming the best CASP7 systems if templates with sequence identity to the query of 10% or better are available. Although this is not the main focus of this paper, we also report on reconstructions of Cα traces based on both ab initio and template-based 4-class map predictions, showing that the latter are generally more accurate even when homology is dubious. Conclusion: Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps. Predictions of protein structures and 8 Å contact maps based on the multi-class distance map predictors described in this paper are freely available to academic users at the URL http://distill.ucd.ie/.
Scopus© Citations 42 442
- Publication: Ab Initio Molecular Dynamics Studies of the Effect of Solvation by Room-Temperature Ionic Liquids on the Vibrational Properties of a N719-Chromophore/Titania Interface
The accurate ab initio modeling of prototypical and well-representative photoactive interfaces for candidate dye-sensitized solar cells (DSCs) is a perennial problem in physical chemistry. To this end, the use of ab initio density functional theory-based molecular dynamics (AIMD) has been studied here to investigate the effect the choice of functional has on a system mimicking the essential workings of a DSC: the energetic properties of a [bmim]+[NTf2]- room-temperature ionic liquid (RTIL) solvating an N719-sensitizing dye adsorbed onto an anatase-titania (101) surface were scrutinized. In so doing, we glean important insights into how using an RTIL as electrolytic hole acceptor alters and modulates the dynamical properties of the widely used N719 dye.
A fully crossed study has been carried out comparing the Becke-Lee-Yang-Parr (BLYP) and Perdew-Burke-Ernzerhof (PBE) functionals, both unsolvated and solvated by the RTIL, both with and without Grimme D3 dispersion corrections. Also, vibrational spectra for the photoactive interface in the DSC configuration were calculated by means of Fourier-transforming atomic mass-weighted velocity autocorrelation functions. The ab initio vibrational spectra were compared to high-quality experimental data and against each other; the effects of various methodological choices on the vibrational spectra were also studied, with PBE generally performing best in producing spectra, which matched the experimental frequency modes typically expected.
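As a rough illustration of the general technique named in this abstract (Fourier-transforming a mass-weighted velocity autocorrelation function), the following is a minimal sketch and not the authors' implementation. The function names, the data layout (`velocities[t][i]` as a 3-tuple for atom `i` at frame `t`), and the use of a plain cosine transform without windowing are all assumptions made for the example.

```python
import math


def mass_weighted_vacf(velocities, masses, max_lag):
    """Mass-weighted velocity autocorrelation for lags 0..max_lag-1.

    velocities[t][i] is the (vx, vy, vz) velocity of atom i at frame t
    (hypothetical layout chosen for this sketch).
    """
    n_frames = len(velocities)
    vacf = []
    for lag in range(max_lag):
        n_origins = n_frames - lag
        total = 0.0
        # Average over all available time origins for this lag.
        for t0 in range(n_origins):
            for m, v0, v1 in zip(masses, velocities[t0], velocities[t0 + lag]):
                total += m * sum(a * b for a, b in zip(v0, v1))
        vacf.append(total / n_origins)
    return vacf


def power_spectrum(vacf, dt):
    """Discrete cosine transform of the VACF.

    Returns (frequencies, intensities); frequencies are in units of 1/dt.
    """
    n = len(vacf)
    freqs, intensities = [], []
    for k in range(n // 2 + 1):
        s = sum(c * math.cos(2.0 * math.pi * k * j / n) for j, c in enumerate(vacf))
        freqs.append(k / (n * dt))
        intensities.append(abs(s))
    return freqs, intensities


# Usage: one atom oscillating at 5 (inverse time units) along x.
dt = 0.01
traj = [[(math.cos(2.0 * math.pi * 5.0 * t * dt), 0.0, 0.0)] for t in range(400)]
freqs, spec = power_spectrum(mass_weighted_vacf(traj, [1.0], 200), dt)
peak_freq = freqs[spec.index(max(spec))]
```

In practice a window function and unit conversion to wavenumbers would normally be applied before comparing with experimental spectra; both are omitted here for brevity.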
Scopus© Citations 4 446
- Publication: Acceleration of grammatical evolution using graphics processing units
Several papers show that symbolic regression is suitable for data analysis and prediction in financial markets. Grammatical Evolution (GE), a grammar-based form of Genetic Programming (GP), has been successfully applied in solving various tasks including symbolic regression. However, often the computational effort to calculate the fitness of a solution in GP can limit the area of possible application and/or the extent of experimentation undertaken. This paper deals with utilizing mainstream graphics processing units (GPU) for acceleration of GE solving symbolic regression. GPU optimization details are discussed and the NVCC compiler is analyzed. We design an effective mapping of the algorithm to the CUDA framework, and in so doing must tackle constraints of the GPU approach, such as the PCI-express bottleneck and main memory transactions. This is the first occasion GE has been adapted for running on a GPU. We measure our implementation running on one core of CPU Core i7 and GPU GTX 480 together with a GE library written in JAVA, GEVA. Results indicate that our algorithm offers the same convergence, and it is suitable for a larger number of regression points where GPU is able to reach speedups of up to 39 times when compared to GEVA on a serial CPU code written in C. In conclusion, properly utilized, GPU can offer an interesting performance boost for GE tackling symbolic regression.
Scopus© Citations 12 742
- Publication: Accurate Orientation Estimation Using AHRS under Conditions of Magnetic Distortion
Low cost, compact attitude heading reference systems (AHRS) are now being used to track human body movements in indoor environments by estimation of the 3D orientation of body segments. In many of these systems, heading estimation is achieved by monitoring the strength of the Earth’s magnetic field. However, the Earth’s magnetic field can be locally distorted due to the proximity of ferrous and/or magnetic objects. Herein, we propose a novel method for accurate 3D orientation estimation using an AHRS, comprised of an accelerometer, gyroscope and magnetometer, under conditions of magnetic field distortion. The system performs online detection and compensation for magnetic disturbances, due to, for example, the presence of ferrous objects. The magnetic distortions are detected by exploiting variations in magnetic dip angle, relative to the gravity vector, and in magnetic strength. We investigate and show the advantages of using both magnetic strength and magnetic dip angle for detecting the presence of magnetic distortions. The correction method is based on a particle filter, which performs the correction using an adaptive cost function and by adapting the variance during particle resampling, so as to place more emphasis on the results of dead reckoning of the gyroscope measurements and less on the magnetometer readings. The proposed method was tested in an indoor environment in the presence of various magnetic distortions and under various accelerations (up to 3 g). In the experiments, the proposed algorithm achieves <2° static peak-to-peak error and <5° dynamic peak-to-peak error, significantly outperforming previous methods.
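The detection idea in this abstract (flagging a distortion when the field strength or the field's angle relative to gravity departs from an undisturbed reference) can be sketched as below. This is only an illustrative threshold test, not the paper's detector or its particle-filter correction; the function names, the tolerance values, and the choice to measure the angle between the magnetometer and gravity vectors directly are assumptions.

```python
import math


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def angle_between_deg(a, b):
    """Angle in degrees between two 3-vectors."""
    cos_theta = sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def is_distorted(mag, gravity, ref_strength, ref_angle_deg,
                 strength_tol=0.1, angle_tol_deg=5.0):
    """Flag a magnetometer sample as distorted.

    A sample is flagged if its relative strength deviation or its
    angle-to-gravity deviation exceeds the (hypothetical) tolerances.
    """
    strength_dev = abs(norm(mag) - ref_strength) / ref_strength
    angle_dev = abs(angle_between_deg(mag, gravity) - ref_angle_deg)
    return strength_dev > strength_tol or angle_dev > angle_tol_deg


# Usage: reference values taken from an assumed undisturbed sample.
gravity = (0.0, 0.0, 1.0)
clean = (0.6, 0.0, 0.8)
ref_strength = norm(clean)
ref_angle = angle_between_deg(clean, gravity)

clean_flag = is_distorted(clean, gravity, ref_strength, ref_angle)
strong_flag = is_distorted((0.9, 0.0, 1.2), gravity, ref_strength, ref_angle)
tilted_flag = is_distorted((0.8, 0.0, 0.6), gravity, ref_strength, ref_angle)
```

Using both tests together means a disturbance that preserves field magnitude but changes its direction (the `tilted_flag` case) is still caught, which is the motivation the abstract gives for combining the two quantities.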
439 Scopus© Citations 78
- Publication: Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information (BioMed Central, 2007-06-14)
Background: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio. Results: Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems to their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available.
Conclusion: The predictive systems are publicly available at the address http://distill.ucd.ie
Scopus© Citations 98 449
- Publication: Adaptive WSN Scheduling for Lifetime Extension in Environmental Monitoring Applications
Wireless sensor networks (WSNs) are often used for environmental monitoring applications in which nodes periodically measure environmental conditions and immediately send the measurements back to the sink for processing. Since WSN nodes are typically battery powered, network lifetime is a major concern. A key research problem is how to determine the data gathering schedule that will maximize network lifetime while meeting the user's application-specific accuracy requirements. In this work, a novel algorithm for determining efficient sampling schedules for data gathering WSNs is proposed. The algorithm differs from previous work in that it dynamically adapts the sampling schedule based on the observed internode data correlation as well as the temporal correlation. The performance of the algorithm has been assessed using real-world datasets. For two-tier networks, the proposed algorithm outperforms a highly cited previously published algorithm by up to 512% in terms of lifetime and by up to 30% in terms of prediction accuracy. For multihop networks, the proposed algorithm improves on the previously published algorithm by up to 553% and 38% in terms of lifetime and accuracy, respectively.
Scopus© Citations 6 543
- Publication: Aggregating Content and Network Information to Curate Twitter User Lists
Twitter introduced user lists in late 2009, allowing users to be grouped according to meaningful topics or themes. Lists have since been adopted by media outlets as a means of organising content around news stories. Thus the curation of these lists is important: they should contain the key information gatekeepers and present a balanced perspective on a story. Here we address this list curation process from a recommender systems perspective. We propose a variety of criteria for generating user list recommendations, based on content analysis, network analysis, and the "crowdsourcing" of existing user lists. We demonstrate that these types of criteria are often only successful for datasets with certain characteristics. To resolve this issue, we propose the aggregation of these different "views" of a news story on Twitter to produce more accurate user recommendations to support the curation process.
Scopus© Citations 12 591
- Publication: An Analysis of Current Trends in CBR Research Using Multi-View Clustering (University College Dublin. School of Computer Science and Informatics, 2009-03)
The European Conference on Case-Based Reasoning (CBR) in 2008 marked 15 years of international and European CBR conferences where almost seven hundred research papers were published. In this report we review the research themes covered in these papers and identify the topics that are active at the moment. The main mechanism for this analysis is a clustering of the research papers based on both co-citation links and text similarity. It is interesting to note that the core set of papers has attracted citations from almost three thousand papers outside the conference collection so it is clear that the CBR conferences are a sub-part of a much larger whole. It is remarkable that the research themes revealed by this analysis do not map directly to the sub-topics of CBR that might appear in a textbook. Instead they reflect the applications-oriented focus of CBR research, and cover the promising application areas and research challenges that are faced.
114
- Publication: An analysis of genotype-phenotype maps in grammatical evolution (Springer, 2010)
We present an analysis of the genotype-phenotype map in Grammatical Evolution (GE). The standard map adopted in GE is a depth-first expansion of the non-terminal symbols during the derivation sequence. Earlier studies have indicated that allowing the path of the expansion to be under the guidance of evolution as opposed to a deterministic process produced significant performance gains on all of the benchmark problems analysed. In this study we extend this analysis to include a breadth-first and random map, investigate additional benchmark problems, and take into consideration the implications of recent results on alternative grammar representations with this new evidence. We conclude that it is possible to improve the performance of grammar-based Genetic Programming by the manner in which a genotype-phenotype map is performed.
624 Scopus© Citations 29
- Publication: Bailigh: Low Power Cross-Layer Data Gathering Protocol for Wireless Sensor Networks
Data gathering systems are an important class of Wireless Sensor Networks (WSNs). As the goals of such systems are long lifetime, high reliability, and unattended operation, efficient use of limited energy is crucial. This paper presents Bailigh, a low power cross-layer protocol designed for low rate periodic data collection. Bailigh schedules the network to wake up at regular intervals for brief periods of data collection. To achieve this Bailigh integrates synchronous low power listening and network level scheduling. The proposed synchronous low power listening technique mitigates clock drift with low overhead and reduces unnecessary radio usage. Integrated scheduling enables staggered communication, which effectively reduces collisions and increases delivery rate. We use simulations to compare the proposed approach with a non cross-layer approach. Results show an average duty cycle of 0.1% and delivery rate of 95%. With duty cycle 8.7 times lower than LPL, Bailigh offers network lifetime of 5.8 years.
361 Scopus© Citations 6
- Publication: BailighPulse: A Low Duty Cycle Data Gathering Protocol For Mostly-Off Wireless Sensor Networks
Mostly-off sensor network applications alternate between long periods of inactivity (ranging from minutes to hours) and short periods of activity (normally a few seconds). From an energy consumption point of view, it is desirable that the network switch off completely during application inactive periods and wake up efficiently at the start of application active periods. The fundamental problem preventing this is the inter-node clock skew arising from the network being off for a long period. Existing solutions maintain synchronization during the inactive period or use the radio excessively to enable asynchronous wake-up. Herein, we propose BailighPulse, a low duty cycle data gathering protocol for mostly-off WSN applications. BailighPulse incorporates a novel multi-hop wake-up scheme that allows for energy efficient recovery of network synchronization after long off periods. The scheme uses a staggered wake-up schedule and optimized channel polling during wake-up based on knowledge of the pre-defined application-level schedule. Herein, we provide an extensive assessment of the protocol’s performance including an analytic model, simulations, and testbed results. We show that, for a homogeneous schedule with collection period greater than 2 min, BailighPulse reduces radio duty cycles by at least 30% and 90% compared to Dozer and B-MAC, respectively. We also show that BailighPulse is able to reduce radio duty cycle by up to 68% for a heterogeneous schedule under similar conditions.
Scopus© Citations 10 379
- Publication: A Bayesian hierarchical model for reconstructing relative sea level: from raw data to rates of change (European Geosciences Union, 2016-02-29)
We present a Bayesian hierarchical model for reconstructing the continuous and dynamic evolution of relative sea-level (RSL) change with quantified uncertainty. The reconstruction is produced from biological (foraminifera) and geochemical (δ13C) sea-level indicators preserved in dated cores of salt-marsh sediment. Our model is comprised of three modules: (1) a new Bayesian transfer (B-TF) function for the calibration of biological indicators into tidal elevation, which is flexible enough to formally accommodate additional proxies; (2) an existing chronology developed using the Bchron age–depth model; and (3) an existing Errors-In-Variables integrated Gaussian process (EIV-IGP) model for estimating rates of sea-level change. Our approach is illustrated using a case study of Common Era sea-level variability from New Jersey, USA. We develop a new B-TF using foraminifera, with and without the additional (δ13C) proxy, and compare our results to those from a widely used weighted-averaging transfer function (WA-TF). The formal incorporation of a second proxy into the B-TF model results in smaller vertical uncertainties and improved accuracy for reconstructed RSL. The vertical uncertainty from the multi-proxy B-TF is ∼28% smaller on average compared to the WA-TF. When evaluated against historic tide-gauge measurements, the multi-proxy B-TF most accurately reconstructs the RSL changes observed in the instrumental record (mean square error = 0.003 m²). The Bayesian hierarchical model provides a single, unifying framework for reconstructing and analyzing sea-level change through time.
This approach is suitable for reconstructing other paleoenvironmental variables (e.g., temperature) using biological proxies.
596 Scopus© Citations 37
- Publication: Beyond the twilight zone: automated prediction of structural properties of proteins by recursive neural networks and remote homology information
The prediction of 1D structural properties of proteins is an important step toward the prediction of protein structure and function, not only in the ab initio case but also when homology information to known structures is available. Despite this, the vast majority of 1D predictors do not incorporate homology information into the prediction process. We develop a novel structural alignment method, SAMD, which we use to build alignments of putative remote homologues that we compress into templates of structural frequency profiles. We use these templates as additional input to ensembles of recursive neural networks, which we specialise for the prediction of query sequences that show only remote homology to any Protein Data Bank structure. We predict four 1D structural properties – secondary structure, relative solvent accessibility, backbone structural motifs, and contact density. Secondary structure prediction accuracy, tested by five-fold cross-validation on a large set of proteins allowing less than 25% sequence identity between training and test set and query sequences and templates, exceeds 82%, outperforming its ab initio counterpart, other state-of-the-art secondary structure predictors (Jpred 3 and PSIPRED) and two other systems based on PSI-BLAST and COMPASS templates. We show that structural information from homologues improves prediction accuracy well beyond the Twilight Zone of sequence similarity, even below 5% sequence identity, for all four structural properties.
Significant improvement over the extraction of structural information directly from PDB templates suggests that the combination of sequence and template information is more informative than templates alone.
Scopus© Citations 39 711
- Publication: Change points of global temperature
We aim to address the question of whether or not there is a significant recent 'hiatus', 'pause' or 'slowdown' of global temperature rise. Using a statistical technique known as change point (CP) analysis we identify the changes in four global temperature records and estimate the rates of temperature rise before and after these changes occur. For each record the results indicate that three CPs are enough to accurately capture the variability in the data with no evidence of any detectable change in the global warming trend since ∼1970. We conclude that the term 'hiatus' or 'pause' cannot be statistically justified.
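To make the notion of a change point concrete, the sketch below finds a single change point by brute force: it tries every split, fits an ordinary least-squares line to each side, and keeps the split with the smallest total squared error, reporting the slope before and after. This is a deliberately simplified stand-in and is not the statistical model used in the paper (which identifies several change points per record); the function names, the `min_size` parameter, and the synthetic data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def best_single_change_point(xs, ys, min_size=3):
    """Return (index, slope_before, slope_after) for the best single split.

    The segment before the change point is xs[:index]; the segment
    after it is xs[index:]. Each segment has at least min_size points.
    """
    best = None
    for k in range(min_size, len(xs) - min_size + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, k, left[0], right[0])
    _, k, slope_before, slope_after = best
    return k, slope_before, slope_after


# Usage on synthetic data: flat for 10 points, then a jump and a unit slope.
xs = list(range(20))
ys = [0.0] * 10 + [5.0 + (i - 10) for i in range(10, 20)]
cp_index, rate_before, rate_after = best_single_change_point(xs, ys)
```

Comparing `rate_before` and `rate_after` is the kind of before/after rate estimate the abstract refers to; a real analysis would also quantify the uncertainty of both the change point location and the rates.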
351 Scopus© Citations 79
- Publication: Coarse Master Equations for Binding Kinetics of Amyloid Peptide Dimers (ACS, 2016-07)
We characterize the kinetics of dimer formation of the short amyloid microcrystal-forming tetrapeptides NNQQ by constructing coarse master equations for the conformational dynamics of the system, using temperature replica-exchange molecular dynamics (REMD) simulations. We minimize the effects of Kramers-type recrossings by assigning conformational states based on their sequential time evolution. Transition rates are further estimated from short-time state propagators, by maximizing the likelihood that the extracted rates agree with the observed atomistic trajectories without any a priori assumptions about their temperature dependence. Here, we evaluate the rates for both continuous replica trajectories that visit different temperatures, and for discontinuous data corresponding to each REMD temperature. While the binding-unbinding kinetic process is clearly Markovian, the conformational dynamics of the bound NNQQ dimer has a complex character. Our kinetic analysis allows us a quantitative discrimination between short-lived encounter pairs and strongly bound conformational states. The conformational dynamics of NNQQ dimers supports a kinetically driven aggregation mechanism, in agreement with the polymorphic character reported for amyloid aggregates such as microcrystals and fibrils.
Scopus© Citations 26 429
- Publication: Coarse-grained model of adsorption of blood plasma proteins onto nanoparticles
We present a coarse-grained model for evaluation of interactions of globular proteins with nanoparticles (NPs). The protein molecules are represented by one bead per amino acid and the nanoparticle by a homogeneous sphere that interacts with the amino acids via a central force that depends on the nanoparticle size. The proposed methodology is used to predict the adsorption energies for six common human blood plasma proteins on hydrophobic charged or neutral nanoparticles of different sizes as well as the preferred orientation of the molecules upon adsorption.
Our approach allows one to rank the proteins by their binding affinity to the nanoparticle, which can be used for predicting the composition of the NP-protein corona. The predicted ranking is in good agreement with known experimental data for protein adsorption on surfaces.
534 Scopus© Citations 60
- Publication: Combining structural analysis and multi-objective criteria for evolutionary architectural design (Springer, 2011)
This study evolves and categorises a population of conceptual designs by their ability to handle physical constraints. The design process involves a trade-off between form and function. The aesthetic considerations of the designer are constrained by physical considerations and material cost. In previous work, we developed a design grammar capable of evolving aesthetically pleasing designs through the use of an interactive evolutionary algorithm. This work implements a fitness function capable of applying engineering objectives to automatically evaluate designs and, in turn, reduce the search space that is presented to the user.
Scopus© Citations 27 1084
- Publication: Community Finding in Large Social Networks Through Problem Decomposition (University College Dublin. School of Computer Science and Informatics, 2008-08)
The identification of cohesive communities is a key process in social network analysis. However, the algorithms that are effective for finding communities do not scale well to very large problems, as their time complexity is worse than linear in the number of edges in the graph. This is an important issue for those interested in applying social network analysis techniques to very large networks, such as networks of mobile phone subscribers. In this respect the contributions of this report are two-fold. First we demonstrate these scaling issues using a prominent community-finding algorithm as a case study. We then show that a two-stage process, whereby the network is first decomposed into manageable subnetworks using a multilevel graph partitioning procedure, is effective in finding communities in networks with more than 10^6 nodes.
107