Keywords

emotion, vocal emotion, acting, psychology, neural networks, artificial intelligence, voice

Data Set Description Summary

This corpus contains over 50 hours of voice-acted readings recorded as part of a dissertation project. Each recording represents one of four acted emotions (anger, fear, happiness, or sadness) or the subject's normal speaking voice. These recordings can be useful for building a simple emotion recognition model. Data were collected from BYU students in 2019. Supporting documents for people wanting to replicate this project are included in Documents.zip. For more information about these files, read the README file.

All audio files are encoded as 44.1 kHz mono .wav audio. Each subject read the 50-word script multiple times, and each recording contains one intact reading. Regular_Instructions.zip is the most complete and polished set of data, with 120 subjects represented in 11,831 .wav files. Variety_Instructions.zip contains recordings from the same 120 subjects in a separate task, in which they read the script while acting one emotion in as many different ways as possible. This folder contains 997 .wav files.
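
As a quick sanity check after extracting an archive, the stated encoding (44.1 kHz, mono) can be confirmed per file. The sketch below uses only Python's standard wave module, which reads uncompressed PCM .wav files. The function names describe_wav and matches_corpus_format are illustrative and not part of the corpus, and the file path passed in is whatever path the archive was extracted to.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM .wav file."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        rate = wf.getframerate()
        frames = wf.getnframes()
    return channels, rate, frames / rate


def matches_corpus_format(path, expected_rate=44100, expected_channels=1):
    """True if the file is mono at 44.1 kHz, as the description states."""
    channels, rate, _ = describe_wav(path)
    return channels == expected_channels and rate == expected_rate
```

Running matches_corpus_format over every extracted .wav file and listing the ones that return False is one way to spot files that were damaged or re-encoded during download.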

Recordings with various script-reading issues from the same 120 subjects are found in Additional_Corpus_Files.zip.

Additional_Subjects.zip contains an additional 11 subjects who had too few good recordings in each emotion, had prevalent background noise, or had their recording session ended prematurely by the research assistant.

Additional audio files are found in Corpus_Files_Without_Forced_Alignment.zip, which contains 168 files that failed one analysis step.

Full_TextGrids.zip contains timing information for the words and sounds in each audio file.
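
For readers who want the word and sound timings as plain rows, the following is a minimal sketch. It assumes the TextGrid files are in Praat's long text format with interval tiers; this description does not confirm that, and the tier names in the real files are not stated here. The function read_intervals is hypothetical and takes the file contents as a string, so the caller is responsible for opening the file with the right text encoding.

```python
import re

# Assumption: Praat long-format TextGrid, where each tier has a
# name = "..." line and each interval lists xmin, xmax, and text.
_TIER = r'name = "(?P<name>[^"]*)"'
_INTERVAL = (
    r'intervals \[\d+\]:\s*'
    r'xmin = (?P<xmin>[-+.\deE]+)\s*'
    r'xmax = (?P<xmax>[-+.\deE]+)\s*'
    r'text = "(?P<text>(?:[^"]|"")*)"'
)
_TOKEN = re.compile(_TIER + "|" + _INTERVAL)


def read_intervals(textgrid_text):
    """Return a list of (tier_name, start_s, end_s, label) tuples."""
    tier = None
    rows = []
    for match in _TOKEN.finditer(textgrid_text):
        if match.group("name") is not None:
            # A new tier starts; later intervals belong to it.
            tier = match.group("name")
        else:
            # Praat escapes a double quote inside a label as "".
            label = match.group("text").replace('""', '"')
            rows.append(
                (tier, float(match.group("xmin")), float(match.group("xmax")), label)
            )
    return rows
```

Each returned tuple gives the tier an interval belongs to, its start and end time in seconds, and its label, which is enough to cut an audio file at word or sound boundaries.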

This data set is licensed under CC BY-NC-SA 4.0, explained in the LICENSE file.