Abstract

Time delays are an inherent part of real-world systems. Beyond simply slowing a system down, time delays often destabilize otherwise stable systems and, perhaps more unexpectedly, can stabilize unstable ones. Here, we propose the Stochastic Time-Delayed Adaptation, a method for improving optimization on certain high-dimensional surfaces that wraps a known optimizer, such as the Adam optimizer, and introduces a variety of time delays. We begin by exploring time delays in certain gradient-based optimization methods and their effect on the optimizer's convergence properties. These optimizers include the standard gradient descent method and the more recent Adam optimizer, which is commonly used in neural networks for deep learning. To describe the effect of time delays on these methods, we use the theory of intrinsic stability. It has been shown that a system possessing intrinsic stability (a stronger form of global stability) maintains its stability when subject to time delays of any kind, e.g., constant, periodic, or stochastic. Where feasible, we find conditions under which the optimization method adapted with time delays is intrinsically stable and therefore converges to the system's minimal value. Finally, we examine the optimizer's performance using common performance metrics, including the number of steps the algorithm takes to converge and the final loss value relative to the global minimum of the loss function. We test these outcomes using various adaptations of the Adam optimizer on several common optimization test functions, which are designed to be difficult for vanilla optimization methods. We show that the Stochastic Time-Delayed Adaptation can greatly improve an optimizer's ability to find the global minimum of a complex loss function.
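As a rough illustration of the idea (not the thesis's actual implementation), the sketch below applies a stochastic time delay to plain gradient descent: with some probability, the update uses the gradient evaluated at a randomly chosen past iterate rather than the current one. The function and parameter names (`stochastic_delayed_gradient_descent`, `delay_prob`, `max_delay`) are purely illustrative assumptions; the same wrapping idea could be applied around Adam or another optimizer.

```python
import numpy as np

def stochastic_delayed_gradient_descent(grad, x0, lr=0.01, max_delay=3,
                                        delay_prob=0.5, steps=1000):
    """Illustrative sketch: gradient descent where, with probability
    `delay_prob`, the step uses the gradient at a past iterate delayed
    by a random 1..max_delay steps instead of the current iterate."""
    history = [np.asarray(x0, dtype=float)]
    x = history[0].copy()
    for _ in range(steps):
        if np.random.rand() < delay_prob and len(history) > 1:
            # sample a stochastic delay, capped by the available history
            tau = np.random.randint(1, min(max_delay, len(history) - 1) + 1)
            g = grad(history[-1 - tau])
        else:
            g = grad(x)  # undelayed step
        x = x - lr * g
        history.append(x.copy())
    return x

# Hypothetical usage: minimize f(x) = ||x||^2, whose gradient is 2x
x_min = stochastic_delayed_gradient_descent(lambda x: 2 * x, x0=[3.0, -2.0])
```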

Degree

MS

College and Department

Physical and Mathematical Sciences; Mathematics

Rights

https://lib.byu.edu/about/copyright/

Date Submitted

2023-04-24

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd13208

Keywords

optimization, time delays, adam optimizer, gradient descent, intrinsic stability

Language

english
