Abstract

This thesis contributes to the discussion of register variation within Egyptian Arabic by examining the usage of verbs in blogs and in transcripts of movies and television shows. Register variation has been extensively researched for English as well as several other languages, yet the lexical and grammatical features that distinguish registers of Egyptian Arabic have not been analyzed. Several challenges have prevented such an analysis, among them the perceived lack of an automatic annotator and the uncertainty of results. To overcome these challenges, two corpora were compiled: one containing texts from blogs and the other containing transcripts of movies and television shows. With each corpus representing a potential register of the dialect, the verbs in each corpus were lemmatized and semi-automatically annotated for either aspect or mood. The verbs were then counted according to lemma, aspect, and mood to determine the extent of variance between the two corpora. The effectiveness of the state-of-the-art automatic annotator was also evaluated by comparing the counts it provided to those produced from corrections of its output. This thesis found that verbs are in fact used differently in the two corpora, which suggests register variation, and it identified potential verbal features characteristic of each register. It also found that the automatic tagger produced counts that led to the same conclusions as the corrected annotation.

Degree

MA

College and Department

Humanities

Rights

http://lib.byu.edu/about/copyright/

Date Submitted

2019-12-01

Document Type

Thesis

Keywords

register variation, corpus, Egyptian Arabic

Language

English
