Keywords

neural machine translation, NMT, terminology, term injection, attention vector

Abstract

Many organizations use domain- or organization-specific words and phrases. This paper explores the use of vetted terminology as an input to neural machine translation (NMT), with the goal of ensuring that the translation of individual terms is consistent with an approved multilingual terminology collection. We discuss, implement, and evaluate a method for injecting terminology, along with a method for evaluating terminology injection. Our approach relies on the long short-term memory (LSTM) attention mechanism prevalent in state-of-the-art NMT systems: attention vectors are used to identify semantic entities and to align the tokens that represent them in both the source and target languages. Appropriate terminology is then injected into matching alignments during decoding. We also introduce a new translation metric that is more sensitive to approved terminological content in machine translation (MT) output.
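
The alignment-then-injection idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes single-token terms, an attention matrix with one row per target token and one column per source token, and a finished draft translation rather than injection inside the decoder's search. The function name `inject_terms`, the glossary entry, and the attention weights are all hypothetical, invented for illustration.

```python
from typing import Dict, List, Sequence


def inject_terms(
    source_tokens: Sequence[str],
    target_tokens: Sequence[str],
    attention: Sequence[Sequence[float]],
    glossary: Dict[str, str],
) -> List[str]:
    """Replace target tokens whose strongest attention falls on a glossary term.

    Hypothetical helper: attention[t][s] is the weight that target step t
    places on source token s.
    """
    output: List[str] = []
    for t, target_token in enumerate(target_tokens):
        weights = attention[t]
        # Index of the source token this target step attends to most strongly.
        s = max(range(len(source_tokens)), key=lambda i: weights[i])
        # Look up the aligned source token in the approved terminology.
        approved = glossary.get(source_tokens[s])
        output.append(approved if approved is not None else target_token)
    return output


# Toy data (invented): an English source, a draft French output, and
# made-up attention weights with one row per draft token.
source = ["the", "ward", "meets", "today"]
draft = ["la", "salle", "se", "réunit", "aujourd'hui"]
attention = [
    [0.85, 0.05, 0.05, 0.05],  # "la" attends mostly to "the"
    [0.05, 0.85, 0.05, 0.05],  # "salle" attends mostly to "ward"
    [0.05, 0.05, 0.85, 0.05],  # "se" attends mostly to "meets"
    [0.05, 0.05, 0.85, 0.05],  # "réunit" attends mostly to "meets"
    [0.05, 0.05, 0.05, 0.85],  # "aujourd'hui" attends mostly to "today"
]
glossary = {"ward": "paroisse"}  # hypothetical approved term pair

result = inject_terms(source, draft, attention, glossary)
```

In this toy case only the token aligned to the glossary entry changes; every other draft token is kept. Multi-token terms and agreement with surrounding words (for example, articles) are left out of the sketch.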

Original Publication Citation

Duane K. Dougal and Deryle W. Lonsdale (2020). Improving NMT Quality Using Terminology Injection. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC2020); European Language Resources Association (ELRA); Marseille, France; pp. 4820-4827.

Document Type

Conference Paper

Publication Date

2020

Publisher

European Language Resources Association

Language

English

College

Humanities

Department

Linguistics

University Standing at Time of Publication

Associate Professor

Included in

Linguistics Commons
