University College Dublin

Following the Embedding: Identifying Transition Phenomena in Wav2vec 2.0 Representations of Speech Audio

Author(s)
English, Patrick Cormac  
Shams, Erfan A.  
Kelleher, John  
Carson-Berndsen, Julie  
URI
http://hdl.handle.net/10197/26389
Date Issued
2024-04-19
Date Available
2024-07-11T09:21:29Z
Abstract
Although transformer-based models have improved the state-of-the-art in speech recognition, it is still not well understood what information from the speech signal these models encode in their latent representations. This study investigates the potential of using labelled data (TIMIT) to probe wav2vec 2.0 embeddings for insights into the encoding and visualisation of speech signal information at phone boundaries. Our experiment involves training probing models to detect phone-specific articulatory features in the hidden layers based on IPA classifications. Furthermore, we propose an analysis framework for visualising the probabilities of the detected articulatory features in every layer and frame vector. Our primary focus is to probe and better understand the structure of speech signal information in the embeddings learned by unsupervised transformers, with a view to contributing to more explainable speech processing systems.
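The probing-and-visualisation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes a logistic-regression probe (the abstract does not name the probe architecture), and it replaces wav2vec 2.0 hidden states and TIMIT labels with synthetic vectors. The helper names (`train_probe`, `probe_probs`, `synthetic_layer`), the three "layers", and their signal strengths are all hypothetical. One probe is trained per layer for a single binary articulatory feature, and the resulting layer-by-frame probability grid is the kind of structure one could plot to inspect behaviour around a phone boundary.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probe(frames, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe (assumed probe type) by batch
    gradient descent: frame vector -> P(feature present)."""
    dim = len(frames[0])
    w, b, n = [0.0] * dim, 0.0, len(frames)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(frames, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_probs(w, b, frames):
    """Probability of the feature for every frame vector."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in frames]


def synthetic_layer(labels, signal, dim, rng):
    """Hypothetical stand-in for one layer's frame embeddings: dimension 0
    carries the feature with strength `signal`, the rest is Gaussian noise."""
    frames = []
    for y in labels:
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x[0] += signal * (1.0 if y else -1.0)
        frames.append(x)
    return frames


rng = random.Random(0)
train_labels = [rng.randint(0, 1) for _ in range(200)]
# Test "utterance": feature absent for 10 frames, then present for 10,
# i.e. a single boundary between frames 9 and 10.
test_labels = [0] * 10 + [1] * 10
signals = [0.0, 1.0, 3.0]  # assumed per-layer feature strength, illustration only

prob_grid = []  # prob_grid[layer][frame] = P(feature present)
for s in signals:
    w, b = train_probe(synthetic_layer(train_labels, s, 8, rng), train_labels)
    prob_grid.append(probe_probs(w, b, synthetic_layer(test_labels, s, 8, rng)))

for layer, row in enumerate(prob_grid):
    print(f"layer {layer}:", " ".join(f"{p:.2f}" for p in row))
```

With real data, the synthetic vectors would be replaced by per-layer hidden states extracted from the model and the labels by frame-level articulatory features derived from the phone annotations; the grid could then be rendered as a heat map with layers on one axis and frames on the other.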
Sponsorship
Science Foundation Ireland
Type of Material
Conference Publication
Publisher
IEEE
Start Page
6685
End Page
6689
Subjects
Speech recognition
Phonetic representati...
Probing
Explainable AI
DOI
10.1109/icassp48485.2024.10446494
Language
English
Status of Item
Peer reviewed
Journal
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference Details
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, 14-19 April 2024
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by/3.0/ie/
File(s)
Name
english.pdf
Size
1.19 MB
Format
Adobe PDF
Checksum (MD5)
0206c8aea6817ccc45999252c727b7a1

Owning collection
Computer Science Research Collection

Item descriptive metadata is released under a CC-0 (public domain) license: https://creativecommons.org/public-domain/cc0/.
All other content is subject to copyright.
