An artificial neural network approach to automatic speech processing

S. M. SINISCALCHI
Investigation
2014-01-01

Abstract

An artificial neural network (ANN) is a powerful mathematical framework used either to model complex relationships between inputs and outputs or to find patterns in data. It is based on an interconnected group of artificial neurons and takes a connectionist approach to computation when processing information. ANNs have been used successfully in a wide variety of applications, such as decision making, quantum chemistry, radar systems, face identification, gesture recognition, handwritten text recognition, medical diagnosis, financial applications, robotics, data mining, and e-mail spam filtering. In the speech community, neural architectures have been used since the beginning of the 1980s, and ANNs have proven useful for several speech processing tasks, e.g., extracting linguistically motivated features, performing speech detection, and generating local scores to be used for different goals. In recent years, there has been renewed interest in ANNs for speech applications, owing to a major advance in pre-training the weights of deep neural networks (DNNs). A new trend of advancing speech technology through the use of ANNs appears to have begun, and it is therefore instructive to review key ANN applications in automatic speech processing. In this paper, several ANN-based applications for speech processing are presented, ranging from speech attribute extraction to phoneme estimation and/or classification. Furthermore, it is shown that ANNs play a key role in several important speech applications, such as large vocabulary continuous speech recognition (LVCSR) and automatic language recognition. The goal of the paper is to summarize the main ANN approaches to speech processing, drawing on the experience gathered in our laboratories over the last seven years.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11387/68126
Citations
  • PMC: not available
  • Scopus: 73
  • ISI: 56