  • Publication
    The Embodied Modelling of Gestural Sequencing in Speech
    (University College Dublin. School of Computer Science and Informatics, 2009-08)
    In this work we formulate and examine the hypothesis that speech sequencing patterns are shaped by the requirements of production and perception efficiency. Many phenomena associated with sequencing in speech can be seen as emergent properties of a skilled motor action system behaving according to biologically and evolutionarily plausible parsimonious criteria. The articulatory sequences under consideration are represented in this thesis in the manner of Articulatory Phonology as gestural scores, i.e., temporal and dynamical descriptions of activation patterns of primitive speech actions – gestures. In order to be able to introduce relevant optimality criteria, we thoroughly evaluate the Task Dynamical implementation of Articulatory Phonology. We modify the task dynamical model of the vocal tract so that it accounts not only for the dynamical nature of the abstract sequenced tasks but also for the physical properties of the end effectors participating in the production of a small, clearly defined set of speech gestures. The resulting embodied task dynamical implementation of gestural sequencing allows us to quantitatively evaluate gestural scores in terms of the cost of their realisation, encompassing both functional aspects and the underlying physical effort associated with speech production. In this work we consider three inter-linked cost functions associated with the production and perception of gestural sequences: articulatory effort, parsing cost, and overall utterance duration. The articulatory effort is a measure of the force exerted by the model muscles in order to achieve a given sequence of gestural targets. The parsing cost is related to the resulting precision of articulation as achieved by the system’s end effectors, and to the demands imposed on the listener in order to parse the utterance. The duration cost straightforwardly reflects the overall duration of the realised gestural score.
The cost functions conceived in this way impose mutually conflicting requirements on the temporal alignments and dynamical parameterisations of the gestural sequence. When they are combined in a single parametric overall cost function used as the objective function of an optimisation problem, this approach leads to the identification of sequencing patterns, i.e., gestural scores, that are optimal with respect to the realistic interplay of perception and production criteria. We find that this optimisation process results in stable gestural sequences that reproduce several known coarticulatory effects, such as the relative order of articulatory events in simple vowel-consonant-vowel sequences and global gestural sequencing patterns involving consonant clusters. The emergent intergestural phasing patterns are evaluated in light of the leading theories of gestural sequencing in speech. Our results indicate a considerable dependency of intergestural phasing relations on the articulatory nature of the sequenced gestures and the physiological and anatomical properties of the articulators involved in their realisation. These results are sometimes at odds with some of the intergestural phasing principles postulated in Articulatory Phonology, but show a strong agreement with articulatory data collected by phoneticians.
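
As a rough illustration of how three conflicting costs can be folded into one parametric objective, the sketch below combines toy effort, parsing, and duration terms as a weighted sum and minimises it by grid search. Everything in it is an assumption made for illustration only: the weighted-sum form, the toy cost formulas, the two-gesture setup, and the names `total_cost` and `optimise` are hypothetical and are not taken from the thesis.

```python
import itertools


def total_cost(d, o, w_effort=1.0, w_parse=1.0, w_dur=1.0):
    """Weighted sum of three toy costs for two gestures (illustrative only).

    d: duration of each gesture (arbitrary units), assumed equal for both
    o: fraction of overlap between the two gestures, from 0.0 to 1.0
    """
    effort = 2.0 / d ** 2        # toy: shorter movements need more force
    parsing = o ** 2             # toy: heavier overlap is harder to parse
    duration = d * (2.0 - o)     # toy: overlap shortens the utterance
    return w_effort * effort + w_parse * parsing + w_dur * duration


def optimise(w_effort=1.0, w_parse=1.0, w_dur=1.0, steps=21):
    """Grid search for the (d, o) pair with the lowest combined cost."""
    durations = [0.5 + 1.5 * i / (steps - 1) for i in range(steps)]  # 0.5 .. 2.0
    overlaps = [i / (steps - 1) for i in range(steps)]               # 0.0 .. 1.0
    return min(
        itertools.product(durations, overlaps),
        key=lambda p: total_cost(p[0], p[1], w_effort, w_parse, w_dur),
    )


if __name__ == "__main__":
    # Changing the weights shifts the optimum between slow, well-separated
    # gestures and fast, heavily overlapped ones.
    print(optimise(w_dur=0.0))
    print(optimise(w_dur=10.0))
```

In this toy setting, setting the duration weight to zero favours slow, non-overlapping gestures, while a large duration weight pushes the optimum towards shorter, more overlapped gestures; the weights play the role of the parameters of the overall cost function.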