Research on the dialects of English spoken within the United States shows variation in lexical, morphological, syntactic, and phonological features. Previous research has tended to focus on one linguistic variable at a time. To incorporate multiple variables in the same analysis, this thesis uses latent class analysis to cluster results from the Harvard Dialect Survey (2003) in order to investigate which phonetic variables from the survey are most closely associated with each dialect. This thesis also examines how closely the latent class analysis results correspond to the Atlas of North American English (Labov, Ash & Boberg, 2005b) and to Joshua Katz's heat maps (Business Insider, 2013; Byrne, 2013; Huffington Post, 2013; The Atlantic, 2013). The results from the Harvard Dialect Survey generally parallel the findings of the Atlas of North American English, providing support for six basic dialects of American English. The variables with the highest probability of occurring in the North dialect are ‘pajamas: /æ/’, ‘coupon: /ju:/’, ‘Monday, Friday: /e:/’, ‘Florida: /ɔ/’, and ‘caramel: 2 syllables’. For the South dialect, the top variables are ‘handkerchief: /ɪ/’, ‘lawyer: /ɒ/’, ‘pajamas: /ɑ/’, and ‘poem: 2 syllables’. The top variables in the West dialect include ‘pajamas: /ɑ/’, ‘Florida: /ɔ/’, ‘Monday, Friday: /e:/’, ‘handkerchief: /ɪ/’, and ‘lawyer: /ɔj/’. For the New England dialect, they are ‘Monday, Friday: /e:/’, ‘route: /ru:t/’, ‘caramel: 3 syllables’, ‘mayonnaise: /ejɑ/’, and ‘lawyer: /ɔj/’. The top variables for the Midland dialect are ‘pajamas: /æ/’, ‘coupon: /u:/’, ‘Monday, Friday: /e:/’, ‘Florida: /ɔ/’, and ‘lawyer: /ɔj/’. For New York City and the Mid-Atlantic States, they are ‘handkerchief: /ɪ/’, ‘Monday, Friday: /e:/’, ‘pajamas: /ɑ/’, ‘been: /ɪ/’, ‘route: /ru:t/’, ‘lawyer: /ɔj/’, and ‘coupon: /u:/’.
One major discrepancy between the results from the latent class analysis and the linguistic atlas is the region of the low back merger. In the latent class analysis, the North dialect has a low probability of the ‘cot/caught’ low back vowel distinction, whereas the linguistic atlas found this to be a salient variable of the North dialect. In conclusion, these results show that the latent class analysis corresponds with current research while contributing additional information by incorporating multiple variables.



College and Department

Humanities; Linguistics and English Language



Date Submitted


Document Type





American English dialects, latent class analysis, dialect variation



Included in

Linguistics Commons