Exploring Generative Error Correction for Dysarthric Speech Recognition

Moreno La Quatra; Valerio Mario Salerno; Sabato Marco Siniscalchi
2025-01-01

Abstract

Despite the remarkable progress in end-to-end Automatic Speech Recognition (ASR) engines, accurately transcribing dysarthric speech remains a major challenge. In this work, we propose a two-stage framework for the Speech Accessibility Project Challenge at INTERSPEECH 2025, which combines cutting-edge speech recognition models with LLM-based generative error correction (GER). We assess different configurations of model scales and training strategies, incorporating specific hypothesis selection to improve transcription accuracy. Experiments on the Speech Accessibility Project dataset demonstrate the strength of our approach on structured and spontaneous speech, while highlighting challenges in single-word recognition. Through comprehensive analysis, we provide insights into the complementary roles of acoustic and linguistic modeling in dysarthric speech recognition.
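
The abstract describes a two-stage pipeline: an ASR front end produces recognition hypotheses, and an LLM rewrites them into a corrected transcript. The sketch below is only a minimal illustration of the general GER pattern; the model checkpoints, prompt wording, and the single-hypothesis input are placeholder assumptions, not the configuration evaluated in this work.

```python
# Illustrative two-stage sketch: ASR hypotheses -> LLM-based generative error
# correction (GER). Model names and the prompt are placeholders, not the
# setup used in the paper.
from transformers import pipeline

# Stage 1: a speech recognizer produces candidate transcriptions.
# "openai/whisper-small" is an assumed checkpoint for illustration.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

# Stage 2: an instruction-tuned LLM rewrites the hypotheses into one transcript.
# The checkpoint below is likewise an assumption.
llm = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")


def generative_error_correction(hypotheses: list[str]) -> str:
    """Ask the LLM to merge ranked ASR hypotheses into a corrected transcript."""
    numbered = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(hypotheses))
    prompt = (
        "The following are candidate transcriptions of the same utterance and "
        "may contain recognition errors:\n"
        f"{numbered}\n"
        "Write the single most likely correct transcription:"
    )
    output = llm(prompt, max_new_tokens=64, do_sample=False)[0]["generated_text"]
    return output[len(prompt):].strip()


# Usage: in practice the list would be an n-best list from beam search,
# optionally filtered by a hypothesis-selection step as described in the paper.
hypotheses = [asr("utterance.wav")["text"]]
print(generative_error_correction(hypotheses))
```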


Use this identifier to cite or link to this document: https://hdl.handle.net/11387/198240