Theses and Dissertations

More is Better than One: The Effect of Ensembling on Deep Learning Performance in Biochemical Prediction Problems

Jacob A. Stern, Brigham Young University

Abstract

This thesis presents two papers addressing important biochemical prediction challenges. The first paper focuses on accurate protein distance predictions and introduces updates to the ProSPr network. We evaluate its performance in the Critical Assessment of techniques for Protein Structure Prediction (CASP14) competition, investigating its accuracy dependence on sequence length and multiple sequence alignment depth. The ProSPr network, an ensemble of three convolutional neural networks (CNNs), demonstrates superior performance compared to individual networks. The second paper addresses the issue of accurate ligand ranking in virtual screening for drug discovery. We propose MILCDock, a machine learning consensus docking tool that leverages predictions from five traditional molecular docking tools. MILCDock, an ensemble of eight neural networks, outperforms single-network approaches and other consensus docking methods on the DUD-E dataset. However, we find that LIT-PCBA targets remain challenging for all methods tested. Furthermore, we explore the effectiveness of training machine learning tools on the biased DUD-E dataset, emphasizing the importance of mitigating its biases during training. Collectively, this work emphasizes the power of ensembling in deep learning-based biochemical prediction problems, highlighting improved performance through the combination of multiple models. Our findings contribute to the development of robust protein distance prediction tools and more accurate virtual screening methods for drug discovery.
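As an illustration of the ensembling idea summarized in the abstract, the following minimal sketch averages the class probabilities predicted by several networks and takes the most likely class from the average. The abstract does not state how the members of ProSPr or MILCDock are combined, so simple probability averaging is an assumption here, and the function names, the four "distance bins", and the numbers are all hypothetical.

```python
from typing import List, Sequence


def ensemble_average(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across ensemble members.

    Assumption: members are combined by an unweighted mean; the thesis
    may use a different combination rule.
    """
    if not member_probs:
        raise ValueError("need at least one member prediction")
    n_classes = len(member_probs[0])
    if any(len(p) != n_classes for p in member_probs):
        raise ValueError("all members must predict the same number of classes")
    n_members = len(member_probs)
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical outputs of three networks over four distance bins
# for a single residue pair (illustrative numbers, not thesis data).
member_probs = [
    [0.10, 0.50, 0.30, 0.10],
    [0.05, 0.30, 0.55, 0.10],
    [0.10, 0.45, 0.35, 0.10],
]

averaged = ensemble_average(member_probs)
predicted_bin = argmax(averaged)
print(averaged, predicted_bin)
```

In this toy input the second network alone would pick a different bin than the other two; averaging the distributions lets the ensemble output reflect all three members rather than any single one.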

Degree

College and Department

Physical and Mathematical Sciences; Computer Science

Rights

https://lib.byu.edu/about/copyright/

BYU ScholarsArchive Citation

Stern, Jacob A., "More is Better than One: The Effect of Ensembling on Deep Learning Performance in Biochemical Prediction Problems" (2023). Theses and Dissertations. 10123.
https://scholarsarchive.byu.edu/etd/10123

Date Submitted

2023-08-07

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd12961

Keywords

deep learning, protein structure prediction, docking, ensembles

Language

English

Included in

Physical Sciences and Mathematics Commons
