speech recognizer, word accuracy, channel model, post-correction
We have implemented a post-processor called SPEECHPP to correct word-level errors committed by an arbitrary speech recognizer. Applying a noisy-channel model, SPEECHPP uses a Viterbi beam search that employs language and channel models. Previous work demonstrated that a simple word-for-word channel model was sufficient to yield substantial increases in word accuracy. This paper demonstrates that further improvements in word accuracy result from augmenting the channel model with an account of word fertility in the channel. This work further demonstrates that a modern continuous speech recognizer can be used in "black-box" fashion to robustly recognize speech for which the recognizer was not originally trained. It also demonstrates that, when the recognizer can be tuned to the new task, environment, or speaker, the post-processor can still contribute to performance improvements.
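The noisy-channel idea summarized above can be illustrated with a minimal sketch. This is not the SPEECHPP implementation: it covers only the simpler word-for-word channel model (no fertility), and the names `CHANNEL`, `BIGRAM`, `FLOOR`, and `correct`, along with every probability value, are hypothetical and invented for illustration. For each recognized word, the search considers the words the speaker may have intended, scores each by a channel probability P(observed | intended) and a bigram language-model probability P(intended | previous word), and keeps only the best few hypotheses at each step.

```python
import math

# Hypothetical toy channel model: CHANNEL[observed][intended] = P(observed | intended).
# All values are made up for illustration; they are not from the paper.
CHANNEL = {
    "i": {"i": 0.9},
    "want": {"want": 0.8},
    "too": {"too": 0.6, "to": 0.3, "two": 0.1},
    "go": {"go": 0.9},
}

# Hypothetical toy bigram language model: BIGRAM[(prev, word)] = P(word | prev).
BIGRAM = {
    ("<s>", "i"): 0.5,
    ("i", "want"): 0.4,
    ("want", "to"): 0.6,
    ("want", "too"): 0.01,
    ("want", "two"): 0.05,
    ("to", "go"): 0.5,
}
FLOOR = 1e-4  # assumed back-off probability for unseen bigrams


def correct(observed, beam_width=3):
    """Return the most probable intended word sequence for `observed`."""
    # Each beam entry maps the last hypothesized word to (log probability, path).
    beam = {"<s>": (0.0, [])}
    for obs in observed:
        # Words missing from the channel model pass through unchanged.
        candidates = CHANNEL.get(obs, {obs: 1.0})
        extended = {}
        for prev, (score, path) in beam.items():
            for word, p_channel in candidates.items():
                p_lm = BIGRAM.get((prev, word), FLOOR)
                new_score = score + math.log(p_channel) + math.log(p_lm)
                # Viterbi step: keep only the best path ending in each word.
                if word not in extended or new_score > extended[word][0]:
                    extended[word] = (new_score, path + [word])
        # Beam pruning: keep the `beam_width` highest-scoring hypotheses.
        ranked = sorted(extended.items(), key=lambda kv: kv[1][0], reverse=True)
        beam = dict(ranked[:beam_width])
    return max(beam.values(), key=lambda entry: entry[0])[1]


print(correct(["i", "want", "too", "go"]))
```

With these toy tables, the recognizer's "too" is replaced by "to" because the language model strongly prefers "want to" over "want too". A fertility-based channel model, as studied in the paper, would additionally allow one intended word to correspond to zero or several recognized words, which this one-to-one sketch cannot express.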
Original Publication Citation
Eric K. Ringger and James F. Allen. October 1996. "A Fertility Channel Model for Post-Correction of Continuous Speech Recognition." Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP'96). Philadelphia, PA.
BYU ScholarsArchive Citation
Ringger, Eric K. and Allen, James F., "A Fertility Channel Model for Post-Correction of Continuous Speech Recognition" (1996). All Faculty Publications. 1288.
Physical and Mathematical Sciences
© 1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Copyright Use Information