Abstract
Neural networks can perform an impressive array of complex tasks, but successfully training a network is difficult because it requires minimizing a function about which we know very little. In practice, developing a good model requires both intuition and considerable trial and error. In this dissertation, we study a type of fully-connected neural network that improves on standard rectifier networks while retaining their useful properties. We then examine this type of network and its loss function from a probabilistic perspective. This analysis leads to a new rule for parameter initialization and a new method for predicting effective learning rates for gradient descent. Experiments confirm that the theory behind these developments translates well into practice.
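The abstract does not spell out the architecture, but the keyword list points to CReLU (concatenated ReLU) networks, which apply both ReLU(x) and ReLU(-x) and concatenate the results, doubling the layer's output width. The sketch below is a minimal illustration of such a dense layer in PyTorch; the He-style initialization shown is only a standard placeholder, since the dissertation's own initialization rule is not given in the abstract.

```python
import torch
import torch.nn as nn


class CReLUDense(nn.Module):
    """One fully-connected layer followed by a CReLU activation.

    CReLU concatenates ReLU(z) and ReLU(-z), so the output width is
    twice the number of linear units in the layer.
    """

    def __init__(self, in_features: int, units: int):
        super().__init__()
        self.linear = nn.Linear(in_features, units)
        # Placeholder He-style initialization; the dissertation derives
        # its own rule, which the abstract does not state.
        nn.init.kaiming_normal_(self.linear.weight)
        nn.init.zeros_(self.linear.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.linear(x)
        return torch.cat([torch.relu(z), torch.relu(-z)], dim=-1)


# Example: a small fully-connected CReLU network for 784-dim inputs.
model = nn.Sequential(
    CReLUDense(784, 128),   # output width 256 after concatenation
    CReLUDense(256, 128),   # output width 256
    nn.Linear(256, 10),
)
```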
Degree
PhD
College and Department
Physical and Mathematical Sciences; Mathematics
Rights
http://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Hettinger, Christopher James, "Hyperparameters for Dense Neural Networks" (2019). Theses and Dissertations. 7531.
https://scholarsarchive.byu.edu/etd/7531
Date Submitted
2019-07-01
Document Type
Dissertation
Handle
http://hdl.lib.byu.edu/1877/etd12249
Keywords
neural networks, backpropagation, gradient descent, CReLU
Language
English