The ability to link a particular phenotype to its causative genotype is one of the most challenging objectives for biological research. Although the genetic code provides an explicit formula for determining the sequence of amino acid phenotypes produced by a given nucleotide sequence, identifying specific residues that are functionally important remains problematic. Many computational approaches have been developed that use patterns observed in DNA sequences to identify these critical sites. However, very few research studies have used empirical data to test whether these approaches are truly able to identify sites of interest.

In most empirical studies, the actual protein function and selective pressures are unknown; thus it is difficult to assess whether computational approaches are correctly identifying critical sites. Here I present two studies that utilize well-characterized empirical systems to evaluate and compare the performance of several computational approaches. In both cases, the proteins under study have specific amino acid substitutions that are confirmed to alter protein function and expected to be constrained by natural selection. In chapter 2, I examine functional variants in angiopoietin-like protein 4 (ANGPTL4), a protein involved in regulating plasma triglyceride levels; loss-of-function variants in this gene are believed to decrease the risk of cardiovascular disease. I apply several computational approaches to identify functional variants, including phylogenetic approaches for detecting positive selection. In chapter 3, I investigate the emergence of drug resistance in HIV-1 during the course of antiretroviral drug therapy. I compare the performance of eight selection detection methods in identifying drug-resistant mutations in 109 intrapatient datasets with HIV-1 sequences isolated at multiple timepoints throughout drug treatment.

It is critical that we develop methods to detect positively selected sites. The ability to detect these sites in silico, without the need for expensive and time-consuming assays, would be invaluable to researchers in evolutionary biology, human genetics, and medicine. Through the research presented in this thesis, I hope to provide insight into the strengths and weaknesses of current approaches, thereby facilitating future research towards the development and improvement of evolutionary models.
College and Department
Life Sciences; Biology
BYU ScholarsArchive Citation
Bendall, Matthew Lewis, "Evaluating the Performance of Computational Approaches for Identifying Critical Sites in Protein-coding DNA Sequences" (2012). All Theses and Dissertations. 3645.
Keywords
positive selection, evolutionary models, HIV-1 drug resistance, ANGPTL4