Presenter/Author Information

Jonathan Goodall, University of Virginia

Keywords

Open Hydrology, Reproducibility, HydroShare, JupyterHub, SUMMA

Start Date

16-9-2020 2:00 PM

End Date

16-9-2020 2:20 PM

Abstract

Technical approaches for enabling more open and reproducible computational models are gaining attention in the environmental modelling and software community. We see three main themes emerging from this research: (1) advancing data sharing platforms, (2) using containers and notebooks to encapsulate complete computational environments and analyses, and (3) developing higher-level Application Programming Interfaces (APIs) for simulation models to make them more scriptable and notebook-friendly. In this research, we explore an approach for combining these themes into a model-agnostic framework that supports open and reproducible environmental modelling. The framework’s design consists of data sharing through online repositories, notebook-based and containerized modelling analyses in the cloud, and model APIs that abstract lower-level details of model configuration, execution, and visualization. We present an example implementation of the approach using HydroShare as the online repository, CUAHSI and CyberGIS JupyterHubs as computational environments, and pySUMMA as an example model API for the Structure for Unifying Multiple Modeling Alternatives (SUMMA) hydrologic model. We demonstrate how the approach can be applied to a past study by (1) creating and organizing HydroShare resources for the study’s data and model files and (2) using Jupyter notebooks and pySUMMA to reproduce figures from that study. Within this context, we will discuss more nuanced views of reproducibility and the challenges to achieving computational reproducibility in environmental modelling that this research does not address.
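
The three-part design described in the abstract (a repository for data and model files, a model API that hides configuration and execution details, and a scripted analysis that ties them together) can be illustrated with a minimal Python sketch. Everything here is a hypothetical stand-in chosen for illustration: `Resource`, `Repository`, `ModelAPI`, `LinearReservoir`, and `reproduce` are not part of HydroShare, pySUMMA, or SUMMA, and the toy linear-reservoir model only stands in for a real hydrologic model.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a repository resource (data plus model files)."""
    resource_id: str
    files: dict = field(default_factory=dict)  # file name -> text content


class Repository:
    """Toy in-memory repository; a real one would be an online service."""

    def __init__(self):
        self._resources = {}

    def publish(self, resource):
        self._resources[resource.resource_id] = resource

    def fetch(self, resource_id):
        return self._resources[resource_id]


class ModelAPI(ABC):
    """Model-agnostic interface that hides configuration and execution details."""

    @abstractmethod
    def configure(self, resource):
        """Read model settings and inputs from a repository resource."""

    @abstractmethod
    def run(self):
        """Execute the model and return its output."""


class LinearReservoir(ModelAPI):
    """Toy model (an assumption, not SUMMA): outflow is k times storage each step."""

    def configure(self, resource):
        self.k = float(resource.files["k"])
        self.inflow = [float(x) for x in resource.files["inflow"].split(",")]

    def run(self):
        storage, outflow = 0.0, []
        for q_in in self.inflow:
            storage += q_in
            q_out = self.k * storage
            storage -= q_out
            outflow.append(q_out)
        return outflow


def reproduce(repo, resource_id, model):
    """Notebook-style step: fetch shared files, configure the model, run it."""
    resource = repo.fetch(resource_id)
    model.configure(resource)
    return model.run()


if __name__ == "__main__":
    repo = Repository()
    repo.publish(Resource("study-1", {"k": "0.5", "inflow": "2,0,0"}))
    print(reproduce(repo, "study-1", LinearReservoir()))  # [1.0, 0.5, 0.25]
```

Because `reproduce` depends only on the `ModelAPI` interface, a different model wrapper could be swapped in without changing the analysis step, which is the sense in which such a framework is model-agnostic.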



Enabling More Open and Reproducible Environmental Modelling
