Abstract
Neural networks have long been known to be universal function approximators, and they have more recently proven powerful and versatile in practice. Finding the right set of parameters and hyperparameters, however, can be extremely challenging: training is expensive and difficult because of the sheer number of parameters and the sensitivity to hyperparameters such as the learning rate and the architecture, and hyperparameter searches are notorious for consuming tremendous amounts of processing power and human effort. This thesis provides an analytic approach to estimating the optimal value of one of the key hyperparameters of a neural network, the learning rate. Where possible, the analysis is computed exactly; where necessary, approximations and assumptions are used and justified. The result is a method that estimates the optimal learning rate for a specific class of network: the fully connected CReLU network.
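For readers unfamiliar with the activation named in the abstract: CReLU (concatenated ReLU, introduced by Shang et al., 2016) applies ReLU to both a pre-activation and its negation and concatenates the results, so no sign information is discarded. The short Python sketch below illustrates the activation only; it is not code from the thesis, and the example values are illustrative assumptions.

    import numpy as np

    def crelu(x):
        # Concatenated ReLU: stack ReLU(x) and ReLU(-x) along the last axis,
        # doubling the width of the output relative to the input.
        return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)], axis=-1)

    # Example: a 3-unit pre-activation becomes a 6-unit activation.
    x = np.array([-1.0, 2.0, -3.0])
    print(crelu(x))  # [0. 2. 0. 1. 0. 3.]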
Degree
MS
College and Department
Physical and Mathematical Sciences; Mathematics
Rights
https://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Moncur, Tyler, "Optimal Learning Rates for Neural Networks" (2020). Theses and Dissertations. 8662.
https://scholarsarchive.byu.edu/etd/8662
Date Submitted
2020-07-30
Document Type
Thesis
Handle
http://hdl.lib.byu.edu/1877/etd11408
Keywords
neural network, learning rate, crelu
Language
English