Methods for transcription and forced alignment of a legacy speech corpus


DASS, DARLA, Digital Archive of Southern Speech, transcription, analysis


This paper describes the transcription and forced alignment of the Digital Archive of Southern Speech (DASS), a subset of the Linguistic Atlas of the Gulf States comprising 372 hours of recordings (64 interviews) conducted across eight southern U.S. states from 1968 to 1983. The project provides a large corpus of historical, semi-spontaneous Southern speech, time-aligned to the audio for acoustic analysis. Manual orthographic transcription of full DASS interviews is carried out according to in-house guidelines that ensure consistency across files and transcribers. Separate codes are used for the interviewee, the interviewer, nonspeech sounds, overlapping speech, and unintelligible speech. Transcriber output is converted to Praat TextGrids using scripts from LaBB-CAT, a tool for maintaining large speech corpora. TextGrids containing only the interviewee’s speech are then generated and submitted to forced alignment by DARLA, which accommodates the levels of variation and noise in the DASS files with a high degree of success. As a step toward acoustic analysis, four methods for vowel formant extraction are evaluated: the native output of DARLA, FAVE, a local implementation of FAVE-Extract, and a Praat-based extractor that incorporates separate formant tracks for different regions of the vowel space. The transcription and analysis workflow is presented for the benefit of other projects of similar size and scope.

Original Publication Citation

Rachel M. Olsen, Michael L. Olsen, Joseph A. Stanley, Margaret E. L. Renwick, & William A. Kretzschmar, Jr. 2017. “Methods for transcription and forced alignment of a legacy speech corpus.” Proceedings of Meetings on Acoustics 30, 060001; doi: http://dx.doi.org/10.1121/2.0000559.

Document Type

Peer-Reviewed Article

Publication Date



Acoustical Society of America

University Standing at Time of Publication

Assistant Professor