Title: Development of a Speech Quality Database Under Uncontrolled Conditions
Authors: Ragano, Alessandro; Benetos, Emmanouil; Hines, Andrew
Type: Conference Publication
Conference: INTERSPEECH 2020, Shanghai, China (held online due to coronavirus outbreak), 25-29 October 2020
Copyright: 2020 ISCA
ISSN: 1990-9772
DOI: 10.21437/Interspeech.2020-1899
Handle: http://hdl.handle.net/10197/25431
Dates: 2024-02-12; 2020-10-24; 2021-01-26
Language: en
Keywords: Speech quality; Speech intelligibility; Apollo space program; Sound archives
Grant numbers: 17/RC-PhD/3483; 17/RC/2289 P2; EP/N510129/1
License: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/

Abstract: Objective audio quality assessment is preferred to avoid time-consuming and costly listening tests. The development of objective quality metrics depends on the availability of datasets appropriate to the application under study. Currently, a suitable human-annotated dataset for developing quality metrics in archive audio is missing. Given the online availability of archival recordings, we propose to develop a real-world audio quality dataset. We present a methodology used to curate a speech quality database using the archive recordings from the Apollo Space Program. The proposed procedure is based on two steps: a pilot listening test and an exploratory data analysis. The pilot listening test shows that we can extract audio clips through the control of speech-to-text performance metrics to prevent data repetition. Through unsupervised exploratory data analysis, we explore the characteristics of the degradations. We classify distinct degradations and we study spectral, intensity, tonality and overall quality properties of the data through clustering techniques. These results provide the necessary foundation to support the subsequent development of large-scale crowdsourced datasets for audio quality.