Keywords
neural machine translation, NMT, terminology, term injection, attention vector
Abstract
Many organizations use domain- or organization-specific words and phrases. This paper explores the use of vetted terminology as an input to neural machine translation (NMT) to improve results by ensuring that the translation of individual terms is consistent with an approved multilingual terminology collection. We discuss, implement, and evaluate a method for injecting terminology, and we describe how terminology injection itself can be evaluated. We use the attention vectors of the long short-term memory (LSTM) attention mechanism prevalent in state-of-the-art NMT systems to identify semantic entities and to align the tokens that represent them in the source and target languages. Appropriate terminology is then injected into matching alignments during decoding. We also introduce a new translation metric that is more sensitive to approved terminological content in MT output.
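The alignment-and-injection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (find_span, inject_terms), the termbase layout, and the argmax alignment rule are all assumptions made for illustration, and the substitution is applied after decoding for simplicity, whereas the abstract describes injection during decoding. The sketch assumes an attention matrix in which attention[t][s] is the weight that target token t places on source token s.

```python
from typing import Dict, List, Sequence, Tuple

Term = Tuple[str, ...]


def find_span(tokens: Sequence[str], phrase: Sequence[str]) -> int:
    """Return the start index of the first occurrence of phrase, or -1."""
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if list(tokens[i:i + n]) == list(phrase):
            return i
    return -1


def inject_terms(
    source: Sequence[str],
    target: Sequence[str],
    attention: Sequence[Sequence[float]],
    termbase: Dict[Term, Term],
) -> List[str]:
    """Replace target tokens aligned to a source term with the approved term.

    Hypothetical helper: alignment is taken as the argmax of each
    target token's attention row, which is an assumption of this sketch.
    """
    # For each target position, the source position it attends to most.
    aligned = [
        max(range(len(source)), key=lambda s: row[s]) for row in attention
    ]

    replacements = []  # (start, end, approved target tokens)
    for src_term, tgt_term in termbase.items():
        start = find_span(source, src_term)
        if start < 0:
            continue  # the source sentence does not contain this term
        span = range(start, start + len(src_term))
        hits = [t for t, s in enumerate(aligned) if s in span]
        if not hits:
            continue  # no target token is aligned to the term
        replacements.append((min(hits), max(hits) + 1, list(tgt_term)))

    # Apply from right to left so earlier indices stay valid.
    # Overlapping replacements are not handled in this sketch.
    out = list(target)
    for start, end, tokens in sorted(replacements, reverse=True):
        out[start:end] = tokens
    return out


source = ["the", "hard", "drive", "is", "full"]
target = ["le", "lecteur", "dur", "est", "plein"]  # made-up baseline output
attention = [
    [0.90, 0.04, 0.03, 0.02, 0.01],
    [0.05, 0.20, 0.70, 0.03, 0.02],
    [0.05, 0.70, 0.20, 0.03, 0.02],
    [0.02, 0.03, 0.05, 0.85, 0.05],
    [0.01, 0.02, 0.02, 0.05, 0.90],
]
termbase = {("hard", "drive"): ("disque", "dur")}

print(inject_terms(source, target, attention, termbase))
# ['le', 'disque', 'dur', 'est', 'plein']
```

The example sentence, attention weights, and termbase entry are invented for the demonstration and do not come from the paper's data.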
Original Publication Citation
Duane K. Dougal and Deryle W. Lonsdale (2020). Improving NMT Quality Using Terminology Injection. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC2020); European Language Resources Association (ELRA); Marseille, France; pp. 4820-4827.
BYU ScholarsArchive Citation
Lonsdale, Deryle W. and Dougal, Duane K., "Improving NMT Quality Using Terminology Injection" (2020). Faculty Publications. 6873.
https://scholarsarchive.byu.edu/facpub/6873
Document Type
Conference Paper
Publication Date
2020
Publisher
European Language Resources Association
Language
English
College
Humanities
Department
Linguistics
Copyright Status
© European Language Resources Association (ELRA)
Copyright Use Information
https://lib.byu.edu/about/copyright/