Abstract

A primary goal of current research in molecular biology is to better understand transcription regulation. An important step is identifying the transcription factors that directly regulate target genes, which in turn is a key step toward eliminating genetic diseases or disease susceptibilities encoded in deoxyribonucleic acid (DNA). Transcription factor binding sites are subject to considerable uncertainty and variation, so they must be represented stochastically: although each transcription factor typically prefers to bind to a specific DNA word, it can also bind to variations of that word. To model this uncertainty, we use a Bayesian approach that allows the binding probabilities associated with a motif to vary. This project presents a new method for motif searching that uses expert prior information to scan DNA sequences for multiple known motif binding sites as well as new motifs. The method uses a mixture model in which each motif of interest is represented by a multinomial distribution with a Dirichlet prior. Informative expert priors are used to search for known motifs, and diffuse priors are used to search for new motifs. The posterior distribution of each motif is then sampled using Gibbs sampling, a Markov chain Monte Carlo (MCMC) technique.
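
The Dirichlet and multinomial pairing described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: it shows only the conjugate update for one motif given a set of already-aligned binding sites, which is the kind of conditional draw a Gibbs sampler would repeat. The function names, the toy sites, and the pseudo-count values (10 for the favored base, 1 otherwise) are all assumptions made for illustration.

```python
import random

BASES = "ACGT"  # column order used for every count and probability vector


def count_matrix(sites):
    """Per-position base counts for aligned, equal-length binding sites."""
    width = len(sites[0])
    counts = [[0] * len(BASES) for _ in range(width)]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos][BASES.index(base)] += 1
    return counts


def posterior_alpha(prior_alpha, counts):
    """Dirichlet posterior parameters: prior pseudo-counts plus observed counts."""
    return [[a + c for a, c in zip(prior_row, count_row)]
            for prior_row, count_row in zip(prior_alpha, counts)]


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized gammas."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_motif(post_alpha, rng):
    """Draw one position-specific probability matrix from the posterior."""
    return [sample_dirichlet(row, rng) for row in post_alpha]


# Toy data (hypothetical): three aligned sites for a known 4-base motif.
sites = ["GGAA", "GGAT", "GGAA"]
counts = count_matrix(sites)

# Informative "expert" prior favoring the word GGAA (assumed pseudo-counts).
expert_prior = [[1, 1, 10, 1], [1, 1, 10, 1], [10, 1, 1, 1], [10, 1, 1, 1]]
# Diffuse prior for a new, unknown motif: every base equally weighted.
diffuse_prior = [[1, 1, 1, 1] for _ in range(4)]

rng = random.Random(0)
expert_post = posterior_alpha(expert_prior, counts)
diffuse_post = posterior_alpha(diffuse_prior, counts)
motif_draw = sample_motif(expert_post, rng)
```

With the expert prior, the posterior stays concentrated on the expected word even with few observed sites, while the diffuse prior lets the observed counts dominate. A full sampler would also need to resample which sequence positions belong to which motif, which this sketch omits.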

Degree

MS

College and Department

Physical and Mathematical Sciences; Statistics

Rights

http://lib.byu.edu/about/copyright/

Date Submitted

2009-04-16

Document Type

Selected Project

Handle

http://hdl.lib.byu.edu/1877/etd2886

Keywords

motif, transcription factor, Gibbs sampling, DNA, XPRIME, ETS, RUNX
