Machine learning to discover new antibiotics targeting Klebsiella pneumoniae
Author(s)
Date Issued
2025
Date Available
2026-01-07T12:32:41Z
Embargo end date
2027-08-06
Abstract
The accelerating global threat of antimicrobial resistance (AMR) demands urgent innovation in antibiotic discovery, particularly against multidrug-resistant (MDR) pathogens such as Klebsiella pneumoniae. Traditional antibiotic development pipelines, constrained by high failure rates and slow progress, have struggled to keep pace with the emergence of resistance. In this context, data-driven approaches offer promising opportunities to revitalise early-stage discovery. This thesis presents a modular machine learning (ML) framework designed to predict antibacterial activity by integrating compound-level molecular analysis, species-specific modelling and experimental validation, supporting the identification and prioritisation of novel therapeutic candidates. Initially, an ensemble of feature-based feedforward neural networks was developed and trained on the heterogeneous ChEMBL database, which spans a wide spectrum of assay conditions and compound classes. Despite the inherent variability and noise in public datasets, the model achieved robust predictive performance and generalised to external compound libraries. Applying the model to a dataset of clinically tested compounds identified six non-antibiotic compounds with apparent in vitro activity against MDR K. pneumoniae strains in a pilot screen; this finding requires replication. Although these compounds have already undergone toxicity and safety studies, their potential for drug repurposing remains uncertain: if the minimum inhibitory concentrations (MICs) required for antibacterial activity exceed safe plasma concentrations in humans, they may not be viable candidates. Further validation is therefore needed to assess their clinical relevance and therapeutic potential.
Expanding beyond a single pathogen, the work developed and compared species-specific and general models across seven bacterial species, revealing that species-specific approaches improved predictive precision, which may be particularly important in the context of pathogen-specific resistance traits. However, general (broad-spectrum) models outperformed species-specific models where the latter were trained on relatively small datasets. A further comparison of classical feature-based models and graph neural networks (GNNs) highlighted their complementary strengths: both architectures identified a shared subset of active compounds, but each also uncovered distinct hits not found by the other. This suggests that the two approaches capture different aspects of molecular representation and may be best used in combination to maximise compound discovery. While the results underscore the potential of machine learning to substantially augment antibiotic discovery, several limitations were identified, including the reliance on chemical descriptors alone, limited incorporation of bacterial resistance mechanisms, and the absence of pharmacokinetic and pharmacodynamic considerations in candidate prioritisation. Overall, this thesis contributes to a scalable and adaptable computational platform capable of accelerating early-stage antibiotic discovery. By combining species-specific and general models with both feature-based and graph-based architectures, this ML approach provides a more comprehensive and flexible framework for predicting antibacterial activity.
In the face of the escalating antimicrobial resistance (AMR) crisis, such interdisciplinary, machine learning-driven strategies may play an increasingly vital role in sustaining the development of effective antibacterial therapies.
Type of Material
Doctoral Thesis
Qualification Name
Doctor of Philosophy (Ph.D.)
Publisher
University College Dublin. School of Medicine
Copyright (Published Version)
2025 the Author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
File(s)
Name
PhD_VillacampaMarina_corrections.pdf
Size
11.31 MB
Format
Adobe PDF
Checksum (MD5)
cc1976abc18979702fcb8379fc18f038
Owning collection