Hidden Markov Models (HMMs) have seen widespread use in applications ranging from speech recognition to gene prediction. Although developed over forty years ago, they remain a standard tool for sequential data analysis. More recently, Latent Dirichlet Allocation (LDA) was developed and soon gained widespread popularity as a powerful topic analysis tool for text corpora. We thoroughly develop LDA and a generalization of HMMs and demonstrate the combined use of both methods in predictive data analysis for health care problems. While these two tools have been used together previously, we use LDA in a new way to reduce the dimensionality involved in training HMMs. With both LDA and our extension of HMMs, we train classifiers to predict the development of Chronic Kidney Disease (CKD) in the near future.
College and Department
Physical and Mathematical Sciences; Mathematics
BYU ScholarsArchive Citation
Victors, Mason Lemoyne, "A Classification Tool for Predictive Data Analysis in Healthcare" (2013). Theses and Dissertations. 5639.
Keywords
predictive data analysis, Hidden Markov Models, Latent Dirichlet Allocation, health care, convex analysis, Markov chains, Expectation Maximization, Gibbs sampling, classification tree, random forest