Modelling Phoneme Similarity in Varieties of English for Human Language Technologies
Author(s)
Date Issued
2024
Date Available
2025-11-14T16:56:18Z
Abstract
At its core, this thesis endeavours to model similarity judgements of phoneme categories in different varieties of English with a focus on simplicity and interpretability. These models can be incorporated into adapted language technologies in order to benefit users with underrepresented spoken varieties and can be examined to provide insight into the pronunciation features of an individual speaker or variety. First, it is shown that speaker judgements of phoneme similarity are not fully predictable based solely on traditional phonological features. Similarity hierarchies of the phonemes of English are constructed from three different phoneme embedding approaches: perception-based, feature-based, and distribution-based. Through qualitative comparison of these three hierarchies, it is demonstrated that some elements of speaker perceptions, which were not explainable in terms of phonological features, appear to be influenced by the environments in which phonemes typically occur in English. As a result of this finding, a model of general English phoneme similarity is constructed based on the distributive properties of phonemes and the acoustic properties of their realisations. A practical application of this similarity model is then explored in the form of a spelling correction method. It is demonstrated that a spellchecker based on comparing the phonemic similarities between a misspelling and potential real-word corrections is better suited to the phonetic writing of children than traditional character-based systems. This interaction between spoken words and written forms prompts further work investigating how regional pronunciation variation might influence orthographic encoding. An adapted spelling correction tool, developed by fine-tuning the similarity model to Irish Accented English, exhibits better performance on misspellings from Irish schoolchildren.
Furthermore, the resulting tuned model is interpretable and captures phonetic and phonological features of the English variety. Furthering this investigation into the interplay between spoken variation and phoneme similarity, models of individuals’ English varieties are constructed for speakers from different regions and with different L1s. Erroneous ASR output is leveraged to construct these models by, again, considering similarity as a function of confusability (this time on the part of the recogniser). It is then demonstrated that speakers with similar varieties produce similar representations. Additionally, similarity models at the variety level are analysed to discover which pronunciation features are detected and, as a result, lead to recognition errors and indicate a weakness in the ASR system. The variant pronunciations captured by the models are shown to align with those which arise from human annotations and with existing literature on the specific English varieties. The work presented in this thesis draws from many areas across Computer Science, Linguistics, and Education. These include machine learning, human language technologies, phonetics and phonology, sociolinguistic variation, and children's literacy acquisition. It is hoped that, above all else, this thesis highlights the benefits of multidisciplinary approaches to these topics and will promote collaboration between the fields.
Type of Material
Doctoral Thesis
Qualification Name
Doctor of Philosophy (Ph.D.)
Publisher
University College Dublin. School of Computer Science
Copyright (Published Version)
2024 the Author
Subjects
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
File(s)
Name
EmmaONeillPhDThesisFinal.pdf
Size
3.42 MB
Format
Adobe PDF
Checksum (MD5)
ef05dc3c1d8d3abb198aef91e3ac72cc
Owning collection