A compound-heterozygous variant occurs when a child inherits a variant from each parent, with the two variants occurring at different positions within the same gene and on opposite homologous chromosomes. Together, these inherited variants may leave the child with two nonfunctional versions of the same gene. Compound-heterozygous variants cannot be identified unless a patient's DNA sequence data is phased. Phasing is a computationally demanding process that requires multiple software tools to determine which nucleotide was inherited from which parent. First, in Chapter 1, we review the literature to understand what research has been conducted on the role of compound-heterozygous variants in pediatric cancers and what methods are used to identify them. In Chapter 2, we develop a pipeline that makes it easier for us and other researchers to phase sequence data and identify compound-heterozygous variants using VCF files from trios or individuals. We then use this pipeline in Chapter 3 to survey the prevalence of compound-heterozygous variants across 7 pediatric disease types. We show the importance of identifying compound-heterozygous variants and what information would be missed if this variant type were not included in study design. In Chapter 4, we develop a software tool that phases trio data using a combination of Mendelian inheritance logic and an existing phasing software program. We show that our software tool increases the total number of variants that can be phased. Finally, in Chapter 5, we use phased data from three nuclear families, each with one child who has pediatric cancer, to evaluate the potential of inherited genomic variants to inform diagnostic decisions. The work contained within this dissertation shows the importance of not overlooking compound-heterozygous variants when trying to identify potentially causal genes in pediatric disease.
In addition, this work provides software tools that are openly available for other researchers to use; these tools make it easier to phase patient DNA sequence data and to identify compound-heterozygous variants.
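The two ideas the abstract relies on, phasing a child's genotype with Mendelian inheritance logic and then checking phased genotypes for compound heterozygosity, can be illustrated with a minimal Python sketch. This is not the implementation of compoundHetVIP or trioPhaser; the function names, the tuple-based data layout, and the gene labels are hypothetical, and the sketch assumes biallelic sites where allele 0 is the reference and allele 1 is the variant.

```python
from collections import defaultdict


def phase_child_genotype(child, mother, father):
    """Phase one child genotype using Mendelian inheritance logic.

    Each genotype is an unordered pair of allele codes, e.g. (0, 1).
    Returns (maternal_allele, paternal_allele), or None when the parental
    genotypes do not determine the phase unambiguously.
    Hypothetical helper for illustration only.
    """
    a, b = child
    if a == b:
        # Homozygous sites are trivially phased.
        return (a, b)
    # Try both assignments and keep those consistent with the parents.
    options = [
        (mat, pat)
        for mat, pat in ((a, b), (b, a))
        if mat in mother and pat in father
    ]
    return options[0] if len(options) == 1 else None


def find_compound_het_genes(phased_variants):
    """Return genes carrying a variant on each parental haplotype.

    phased_variants: iterable of (gene, position, maternal_allele,
    paternal_allele), one record per position (assumed layout).
    Only heterozygous sites count; homozygous-alternate sites are ignored.
    """
    maternal_hits = defaultdict(set)
    paternal_hits = defaultdict(set)
    for gene, position, mat, pat in phased_variants:
        if mat != 0 and pat == 0:
            maternal_hits[gene].add(position)
        elif pat != 0 and mat == 0:
            paternal_hits[gene].add(position)
    # A gene qualifies when both haplotypes carry at least one variant.
    return sorted(g for g in maternal_hits if paternal_hits.get(g))


# Usage with made-up data: only GENE_A has variants on opposite haplotypes.
example = [
    ("GENE_A", 100, 1, 0),
    ("GENE_A", 250, 0, 1),
    ("GENE_B", 300, 1, 0),
    ("GENE_B", 400, 1, 0),
    ("GENE_C", 500, 1, 1),
]
compound_het_genes = find_compound_het_genes(example)
```

When both parents are heterozygous at a site, `phase_child_genotype` returns `None`, which is the kind of case where Mendelian logic alone is not enough and a separate phasing program is needed. The second function makes clear why phasing matters: without knowing which haplotype each variant sits on, GENE_A and GENE_B in the example would look identical.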
College and Department
Life Sciences; Biology
BYU ScholarsArchive Citation
Miller, Dustin B., "Using Phased Whole Genome Sequence Data to Better Understand the Role of Compound-Heterozygous Variants in Pediatric Diseases" (2021). Theses and Dissertations. 9363.
Keywords
compound heterozygous, trios, nuclear families, phasing, pipeline, whole genome sequencing, pediatric cancer, structural birth defects, compoundHetVIP, trioPhaser