An important challenge in natural language surface realization is the generation of grammatical sentences from incomplete sentence plans. Realization can be broken into a two-stage process: an over-generating rule-based module, followed by a ranker that uses a statistical language model to select the most probable candidate sentence. Thus far, an n-gram language model has been evaluated in this context. More sophisticated syntactic knowledge is expected to improve such a ranker. In this thesis, a new language model based on featurized functional dependency syntax was developed and evaluated. The new language model did not outperform the comparison bigram language model in either generation accuracy or cross-entropy.
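The two-stage process described in the abstract can be illustrated with a minimal sketch of the ranking stage using a bigram language model. This is not the thesis's implementation: the function names (`train_bigram`, `log_prob`, `rank`, `cross_entropy`), the toy corpus, the hand-written candidate list standing in for the over-generating module, and the use of add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count contexts and bigrams over sentences padded with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    # Vocabulary of tokens that can be predicted (words plus end marker).
    vocab = {w for s in corpus for w in s.split()} | {"</s>"}
    return contexts, bigrams, len(vocab)

def log_prob(sentence, model):
    """Natural-log probability of a sentence under add-one smoothing (an assumption)."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

def rank(candidates, model):
    """Return the candidate the language model scores as most probable."""
    return max(candidates, key=lambda c: log_prob(c, model))

def cross_entropy(sentences, model):
    """Average negative log2-probability per predicted token."""
    total = sum(log_prob(s, model) for s in sentences)
    n_tokens = sum(len(s.split()) + 1 for s in sentences)  # +1 for "</s>"
    return -total / (n_tokens * math.log(2))

# Toy training data (hypothetical).
corpus = ["the dog chased the cat", "the cat slept", "a dog barked"]
model = train_bigram(corpus)

# Hypothetical output of an over-generating module: one word set, several orders.
candidates = [
    "dog the chased cat the",
    "the dog chased the cat",
    "cat the dog the chased",
]
best = rank(candidates, model)
print(best)
```

A ranker built on syntactic features, as the thesis proposes, would replace `log_prob` with a score from the syntactic model while leaving the selection step in `rank` unchanged.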
College and Department
Physical and Mathematical Sciences; Computer Science
BYU ScholarsArchive Citation
Packer, Thomas L., "Surface Realization Using a Featurized Syntactic Statistical Language Model" (2006). All Theses and Dissertations. 384.
Keywords
natural language generation, natural language processing, NLP, NLG, Bayesian networks, decision trees, context specific independence, realization, statistical language model, standard pipeline architecture, n-gram (bigram) language model, syntax, features, statistical model, machine learning