Faculty Publications

Evaluating machine-assisted annotation in under-resourced settings

Deryle W. Lonsdale, Brigham Young University
Paul L. Felt, Brigham Young University - Provo
Eric K. Ringger, Brigham Young University - Provo
Kevin Seppi, Brigham Young University
Kristian Heal, Brigham Young University - Provo
Robbie A. Haertel, Brigham Young University - Provo

Keywords

Annotation, Corpus annotation, Machine assistance, Syriac studies, Bayesian data analysis, User study, Language resource evaluation

Abstract

Machine assistance is vital to managing the cost of corpus annotation projects. Identifying effective forms of machine assistance through principled evaluation is particularly important and challenging in under-resourced domains and highly heterogeneous corpora, as the quality of machine assistance varies. We perform a fine-grained evaluation of two machine-assistance techniques in the context of an under-resourced corpus annotation project. This evaluation requires a carefully controlled user study crafted to test a number of specific hypotheses. We show that human annotators performing morphological analysis of text in a Semitic language perform their task significantly more accurately and quickly when even mediocre pre-annotations are provided. When pre-annotations are at least 70% accurate, annotator speed and accuracy show statistically significant relative improvements of 25–35% and 5–7%, respectively. However, controlled user studies are too costly to be suitable for under-resourced corpus annotation projects. Thus, we also present an alternative analysis methodology that models the data as a combination of latent variables in a Bayesian framework. We show that modeling the effects of interesting confounding factors can generate useful insights. In particular, correction propagation appears to be most effective for our task when implemented with minimal user involvement. More importantly, by explicitly accounting for confounding variables, this approach has the potential to yield fine-grained evaluations using data collected in a natural environment outside of costly controlled user studies.

Original Publication Citation

Paul Felt, Eric K. Ringger, Kevin Seppi, Kristian S. Heal, Robbie A. Haertel, and Deryle Lonsdale (2014). Evaluating machine-assisted annotation in under-resourced settings. Language Resources & Evaluation 48(4):561-599. Springer Science+Business Media (online publication date November 2013).

BYU ScholarsArchive Citation

Lonsdale, Deryle W.; Felt, Paul L.; Ringger, Eric K.; Seppi, Kevin; Heal, Kristian; and Haertel, Robbie A., "Evaluating machine-assisted annotation in under-resourced settings" (2013). Faculty Publications. 6881.
https://scholarsarchive.byu.edu/facpub/6881

Document Type

Peer-Reviewed Article

Publication Date

2013-11

Publisher

Springer Science+Business Media

Language

English

College

Humanities

Department

Linguistics

University Standing at Time of Publication

Associate Professor

Copyright Status

Copyright Use Information

https://lib.byu.edu/about/copyright/
