Keywords
computational reproducibility, hydrologic modeling, MODFLOW, metadata
Start Date
28-6-2018 9:00 AM
End Date
28-6-2018 10:20 AM
Abstract
Reproducibility of computational workflows is an important challenge that calls for open and reusable code and data, well-documented workflows, and controlled environments that allow others to verify published findings. HydroShare (http://www.hydroshare.org) and GeoTrust (http://geotrusthub.org/), two new cyberinfrastructure tools under active development, can be used to improve reproducibility in computational hydrology. HydroShare is a web-based system for sharing hydrologic data and model resources. It allows hydrologists to upload model input data as resources, add hydrology-specific metadata to these resources, and use the data directly within HydroShare for collaborative modeling with tools like JupyterHub. GeoTrust provides tools for scientists to efficiently reproduce, track, and share geoscience applications by building ‘sciunits,’ which are efficient, lightweight, self-contained packages of computational experiments that can be guaranteed to repeat or reproduce regardless of deployment challenges. We will present a use case focusing on a workflow that uses the MODFLOW model to demonstrate how HydroShare and GeoTrust can be integrated to easily and efficiently reproduce computational workflows. This use case automates pre-processing of model inputs, model execution, and post-processing of model output. This work demonstrates that the integration of HydroShare and GeoTrust ensures the logical and physical preservation of computational workflows, and that reproducibility can be achieved by replicating the original sciunit, modifying it to produce a new sciunit, and finally preserving and sharing the newly created sciunit using HydroShare's JupyterHub.
Achieving Reproducible Computational Hydrologic Models by Integrating Scientific Cyberinfrastructures
Stream and Session
F4: Replicability and Reproducibility in Research: From Vaporware to Software in Environmental Computing