Speech recognition is one of the most important problems in artificial intelligence today. Despite numerous advances over the past few decades, the performance of modern speech recognition programs often leaves much to be desired. This is not for lack of effort, but because recognizing human speech is a very difficult problem to solve. The challenge is largely invisible to us because our auditory systems are extremely powerful speech recognizers. In this thesis, we will explore the processes involved in human speech perception, examine the algorithms used for computer speech recognition, and discuss a way to use neural computing to make speech recognizers more effective. There are two chief reasons speech recognition is such a difficult problem. The first is that speech is difficult to characterize: we do not have a reliable way to tell which linguistic unit a given utterance is supposed to correspond to. The second is that natural language is difficult to understand, and a great deal of the robustness of human speech perception comes from our ability to use linguistic cues to fill in for uncertain sounds. This thesis primarily focuses on a possible solution to the first problem. Most speech recognition programs classify utterances with parametric statistical models, which make a number of potentially inaccurate assumptions about the nature of speech. I propose that we instead recognize sounds with neural networks, a method of classification loosely modeled on the neurons in our brains. By combining neural networks with effective signal processing and hidden Markov models, the core components of most practical speech recognizers, we might be able to recognize speech more effectively.
Thompson, Nathaniel, "Recognizing Speech with Neural Networks" (2011). Senior Theses. 566.