Advances in water resources modeling are improving the information that can be supplied to support decisions that affect the safety and sustainability of society, but these advances make models more computationally demanding. To facilitate the use of cost-effective computing resources to meet the increased demand through high-throughput computing (HTC) and cloud computing in modeling workflows and web applications, I developed a comprehensive Python toolkit that provides the following features: (1) programmatic access to diverse, dynamically scalable computing resources; (2) a batch scheduling system to queue and dispatch the jobs to the computing resources; (3) data management for job inputs and outputs; and (4) the ability for jobs to be dynamically created, submitted, and monitored from the scripting environment. To compose this comprehensive computing toolkit, I created two Python libraries (TethysCluster and CondorPy) that leverage two existing software tools (StarCluster and HTCondor). I further facilitated access to HTC in web applications by using these libraries to create powerful and flexible computing tools for Tethys Platform, a development and hosting platform for web-based water resources applications. I tested this toolkit while collaborating with other researchers on several modeling applications that required scalable computing. These applications included a parameter sweep with 57,600 realizations of a distributed hydrologic model; a set of web applications for retrieving and formatting data; a web application for evaluating the hydrologic impact of land-use change; and an operational, national-scale, high-resolution, ensemble streamflow forecasting tool. In each of these applications the toolkit was successful in automating the process of running the large-scale modeling computations in an HTC environment.
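As a rough illustration of features (2) through (4), the sketch below builds the text of an HTCondor submit description for a parameter sweep from a Python script. It is not the CondorPy or TethysCluster API, which the abstract does not show; the function name `build_submit_description` and the file names `run_model.sh`, `project.zip`, and `params.csv` are hypothetical. Only standard HTCondor submit commands are used (`executable`, `arguments`, `transfer_input_files`, `queue`, and the `$(Process)` macro).

```python
def build_submit_description(executable, realizations, input_files):
    """Return HTCondor submit-description text for a parameter sweep.

    Hypothetical helper for illustration only. Each queued job receives
    its realization index as a command-line argument via $(Process).
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # $(Process) expands to 0, 1, ..., realizations - 1, one per job.
        "arguments = $(Process)",
        # Data management: ship inputs to the worker, return outputs on exit.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {', '.join(input_files)}",
        # Separate stdout/stderr files per realization; one shared event log.
        "output = run_$(Process).out",
        "error = run_$(Process).err",
        "log = sweep.log",
        # Queue one job per realization.
        f"queue {realizations}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # File names here are assumptions; 57600 matches the sweep size in the abstract.
    text = build_submit_description(
        "run_model.sh", 57600, ["project.zip", "params.csv"]
    )
    print(text)
```

Writing the returned text to a file and passing it to `condor_submit` would queue the whole sweep in one step. A library such as CondorPy presumably wraps this kind of description behind Python objects so that jobs can also be monitored from the script.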
College and Department
Ira A. Fulton College of Engineering and Technology; Civil and Environmental Engineering
BYU ScholarsArchive Citation
Christensen, Scott D., "A Comprehensive Python Toolkit for Harnessing Cloud-Based High-Throughput Computing to Support Hydrologic Modeling Workflows" (2016). All Theses and Dissertations. 5667.
Keywords
cloud computing, high-throughput computing, Tethys Platform, GSSHA, hydrologic modeling, Python