Keywords
Annotation cost, Annotators' skill levels, Machine-assisted pre-annotation, Active Learning, Hourly cost estimation
Abstract
Fixed, limited budgets often constrain the amount of expert annotation that can go into the construction of annotated corpora. Estimating the cost of annotation is the first step toward using annotation resources wisely. We present here a study of the cost of annotation. This study includes the participation of annotators at various skill levels and with varying backgrounds. Conducted over the web, the study consists of tests that simulate machine-assisted pre-annotation, requiring correction by the annotator rather than annotation from scratch. The study also includes tests representative of an annotation scenario involving Active Learning as it progresses from a naïve model to a knowledgeable model; in particular, annotators encounter pre-annotation of varying degrees of accuracy. The annotation interface lists tags considered likely by the annotation model in preference to other tags. We present the experimental parameters of the study and report both descriptive and inferential statistics on the results of the study. We conclude with a model for estimating the hourly cost of annotation for annotators of various skill levels. We also present models for two granularities of annotation: sentence at a time and word at a time.
Original Publication Citation
Eric Ringger, Marc Carmen, Robbie Haertel, Kevin Seppi, Deryle Lonsdale, Peter McClanahan, James Carroll, Noel Ellison (2008). Assessing the Costs of Machine-Assisted Corpus Annotation Through a User Study; Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), pp. 3318-3324; European Language Resources Association.
BYU ScholarsArchive Citation
Lonsdale, Deryle W.; Ringger, Eric K.; Carmen, Marc A.; Haertel, Robbie A.; Seppi, Kevin; McClanahan, Peter J.; Carroll, James; and Ellison, Noel, "Assessing the Costs of Machine-Assisted Corpus Annotation Through a User Study" (2008). Faculty Publications. 6848.
https://scholarsarchive.byu.edu/facpub/6848
Document Type
Conference Paper
Publication Date
2008
Publisher
European Language Resources Association
Language
English
College
Humanities
Department
Linguistics
Copyright Use Information
https://lib.byu.edu/about/copyright/