Abstract

Automated essay scoring has long been a research topic in computational linguistics. Only recently, however, has research addressed the particular challenges of automatically assigning holistic scores to ESL essays, which have high rates of grammatical, spelling, and other errors. This thesis evaluates how effectively statistical measures of linguistic maturity predict holistic scores for ESL essays, using several techniques. The selected linguistic attributes include parts of speech, part-of-speech patterns, vocabulary density, and sentence and essay lengths. With customized algorithms based on multivariable regression analysis as well as memory-based machine learning, the holistic scores predicted for test essays fell within ±1.0 scoring level of the human judges' scores 90% of the time on average. This is an improvement over the 66% prediction level attained in a previous study using customized algorithms.
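
As a rough illustration of two quantities named in the abstract, the Python sketch below computes a vocabulary density feature and the share of predictions that land within ±1.0 of a human score. It is not the thesis's implementation: the function names are hypothetical, vocabulary density is assumed here to be a simple type-token ratio (the abstract does not define it), and the scores in the usage lines are invented for illustration only.

```python
import re


def vocabulary_density(text: str) -> float:
    """Type-token ratio: distinct word forms divided by total words.

    Assumption: the abstract does not define vocabulary density,
    so this simple ratio stands in for it.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def adjacent_agreement(predicted, human, tolerance=1.0):
    """Fraction of essays whose predicted score is within `tolerance`
    of the human judge's score."""
    if len(predicted) != len(human):
        raise ValueError("score lists must have the same length")
    if not predicted:
        return 0.0
    hits = sum(abs(p - h) <= tolerance for p, h in zip(predicted, human))
    return hits / len(predicted)


# Invented example values, not data from the thesis.
density = vocabulary_density("the cat saw the dog")
agreement = adjacent_agreement([3.0, 4.5, 2.0, 5.0], [3, 4, 4, 5])
```

In this toy run, three of the four predicted scores fall within one level of the human score, so the agreement rate is 0.75. The 90% figure reported in the abstract is an average of this kind of rate over the test essays.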

Degree

MA

College and Department

Humanities; Linguistics and English Language

Rights

http://lib.byu.edu/about/copyright/

Date Submitted

2006-07-21

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd1507

Keywords

ESL, Holistic Score, Essay Analysis, Machine Learning, Linear Regression Analysis, WordMap, TiMBL

Included in

Linguistics Commons
