Abstract
Low-resource languages, including sign languages, are a challenge for machine translation research. Given the lack of parallel corpora, current researchers must be content with a small parallel corpus in a narrow domain for training a system. For this thesis, we obtained a small parallel corpus of English text and American Sign Language gloss from The Church of Jesus Christ of Latter-day Saints. We cleaned the corpus by loading it into an open-source translation memory tool, where we removed computer markup language and split the large chunks of text into sentences and phrases, creating a total of 14,247 sentence pairs. We randomly partitioned the corpus into three sections: 70% for a training set, 10% for a development set, and 20% for a test set. After downloading and installing the open-source Moses toolkit, we went through several iterations of training, translating, and evaluating the system. The final evaluation on unseen data yielded a state-of-the-art score for a low-resource language.
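The random 70/10/20 partition described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the thesis's actual procedure: the function name `partition_corpus`, the fixed seed, the rounding behavior, and the placeholder sentence pairs are all assumptions, since the abstract does not say how the split was implemented.

```python
import random


def partition_corpus(pairs, train_frac=0.7, dev_frac=0.1, seed=0):
    """Shuffle sentence pairs and split them into train/dev/test lists.

    Hypothetical helper: the fractions follow the abstract, but the seed
    and the truncating rounding are illustrative choices only.
    """
    shuffled = list(pairs)                  # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for repeatability
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]       # the remainder (about 20%)
    return train, dev, test


# Placeholder data standing in for the 14,247 English / ASL gloss pairs.
pairs = [(f"english sentence {i}", f"GLOSS {i}") for i in range(14247)]
train, dev, test = partition_corpus(pairs)
print(len(train), len(dev), len(test))
```

Keeping each English sentence and its gloss together as one tuple means the shuffle cannot misalign the two sides of the corpus. Each resulting list would then be written out as a pair of line-aligned files before training.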
Degree
MA
College and Department
Humanities; Linguistics and English Language
Rights
http://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Bonham, Mary Elizabeth, "English to ASL Gloss Machine Translation" (2015). Theses and Dissertations. 5478.
https://scholarsarchive.byu.edu/etd/5478
Date Submitted
2015-06-01
Document Type
Thesis
Handle
http://hdl.lib.byu.edu/1877/etd8680
Keywords
machine translation, ASL, sign language gloss
Language
English