Date of Submission

Spring 2016

Academic Programs and Concentrations

Computer Science; Philosophy

Project Advisor 1

Sven Anderson

Abstract/Artist's Statement

Senior Project submitted to The Division of Science, Mathematics and Computing of Bard College.

Abstract

Emotion recognition in speech using deep learning begins as a problem of translating raw auditory data into an informationally rich feature set on which a neural network can be trained, ideally yielding a machine learning system that accurately classifies the paralinguistic content of speech. We performed feature extraction using Praat, a tool for phonetic analysis, and obtained a variety of harmonic, intensity, and spectral characteristics that together formed the basis for the training vectors in our machine learning system.

While a number of different machine learning approaches have proved successful, there has been a strong resurgence in the application of so-called ‘deep learning’ systems to machine learning problems, owing to the striking success achieved with them on modern hardware [8]. After empirically validating a network architecture and learning parameters, we trained six neural networks for the problem of emotion classification. We used six independent networks, each with the same architecture and learning parameters, because this allowed us to obtain empirical data about the relative efficacy of different training features. Our six training sets represent the pairwise extraction of different subsets of features from a larger pool of features ranging from spectral and energy characteristics to periodicity. This allowed us to paint a more general picture of which facets of human speech are most relevant to and indicative of emotionality.
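
The pairwise construction of training sets described above can be illustrated with a minimal sketch. The abstract does not state how many feature groups were paired or how they were named; the four groups and the column names below are purely hypothetical placeholders, chosen only because four groups happen to yield six unordered pairs. This is not the project's actual code.

```python
from itertools import combinations

# Hypothetical feature groups and column names (assumptions, not taken
# from the project). Four groups give C(4, 2) = 6 unordered pairs.
FEATURE_GROUPS = {
    "spectral": ["spec_0", "spec_1"],
    "energy": ["energy_0"],
    "harmonic": ["harm_0"],
    "periodicity": ["period_0"],
}


def pairwise_feature_sets(groups):
    """Return the combined column list for every unordered pair of groups."""
    return {
        (a, b): groups[a] + groups[b]
        for a, b in combinations(sorted(groups), 2)
    }


training_sets = pairwise_feature_sets(FEATURE_GROUPS)
for pair, columns in training_sets.items():
    print(pair, columns)
```

Under these assumptions, each resulting column list would define the input vector for one of the identically configured networks, so that differences in accuracy can be attributed to the features rather than to the architecture.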

Access Agreement

On-Campus only

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Recommended Citation

Segal, Eli Ridley, "Paralinguistic Speech Recognition: Classifying Emotion in Speech with Deep Learning Neural Networks" (2016). Senior Projects Spring 2016. 363.
https://digitalcommons.bard.edu/senproj_s2016/363

