2018
Monday, June 25th
9:00 AM
A Simplified Approach for Water Resources Web Processing Services (WPS) Development Xiaohui Qiao, Brigham Young University 9:00 AM - 10:20 AM Developing a complex water resources modelling web application can be a daunting task that requires integration of various models and data sources with ever-changing internet technologies. Service-oriented architecture (SOA) has been shown to be useful for building complex modelling workflows. However, compared with other types of web services, such as data delivery and mapping, the implementation of web processing services (WPS) for water resources modelling and data analysis is not very common. Indeed, tools to simplify the development and deployment of WPS for general modelling cases are lacking. We will present the development and testing of a ready-to-use WPS implementation called Tethys WPS Server, which provides a formalized way to expose web application functionality as standardized WPSs alongside an app’s graphical user interface. Our WPS server is Python-based and is built on Tethys Platform by leveraging PyWPS. A case study demonstrates how web app functionality can be exposed as WPS using our open source package, and shows how these WPSs can be coupled to build a complex modelling app. The advantages of Tethys WPS Server include: 1) lowering the barrier to OGC WPS development and deployment, 2) providing web services-based access to apps, and 3) improving app interoperability and reusability and facilitating complex modelling implementation.
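For readers unfamiliar with PyWPS, the following minimal sketch shows the general shape of a PyWPS process of the kind such a server can expose; the identifier, inputs, and the toy runoff-ratio calculation are illustrative and are not taken from the Tethys WPS Server code.

```python
from pywps import Process, LiteralInput, LiteralOutput

class RunoffRatio(Process):
    """Toy process: ratio of runoff depth to precipitation depth."""

    def __init__(self):
        inputs = [
            LiteralInput("precip_mm", "Precipitation depth (mm)", data_type="float"),
            LiteralInput("runoff_mm", "Runoff depth (mm)", data_type="float"),
        ]
        outputs = [LiteralOutput("ratio", "Runoff ratio", data_type="float")]
        super().__init__(
            self._handler,
            identifier="runoff_ratio",
            title="Runoff ratio",
            inputs=inputs,
            outputs=outputs,
        )

    def _handler(self, request, response):
        # read the literal inputs and write the single literal output
        p = request.inputs["precip_mm"][0].data
        q = request.inputs["runoff_mm"][0].data
        response.outputs["ratio"].data = q / p
        return response
```

A process like this is then registered with the WPS service so that clients can invoke it through standard OGC Execute requests.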
9:00 AM
Dali Wang, Oak Ridge National Laboratory 9:00 AM - 10:20 AM The complexity of large scientific models developed for certain machine architectures and application requirements has become a real barrier that impedes continuous software development. In this study, we use experience from several practices, including open-source software engineering, software dependency understanding, compiler technologies, analytical performance modeling, micro-benchmarks, and functional unit testing, to design software toolkits that enhance software productivity and performance. Our software tools collect information on scientific codes and extract the common features of these codes. In this paper, we focus on the front end of our system (Software X-ray Scanner): a metric information collection system for better understanding of key scientific functions and their associated dependencies. We use several science codes from the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, the Exascale Computing Project (ECP), and Subsurface Biogeochemical Research (SBR) to explore cost-efficient approaches for program understanding and code refactoring. The toolkits increase software productivity for the Interoperable Design of Extreme-scale Application Software (IDEAS) community, which is supported by both the US Department of Energy’s Advanced Scientific Computing Research (ASCR) and Biological and Environmental Research (BER) programs. We expect that these toolkits can benefit broader scientific communities facing similar challenges.
9:00 AM
Rohit Khattar, Brigham Young University 9:00 AM - 10:20 AM GIS-enabled web applications for environmental data management and modelling are gaining momentum as web technologies and cloud storage become less expensive and more readily accessible. This has resulted in an increase in the number of published GIS-enabled web applications that allow policy makers and stakeholders to perform otherwise complex analyses by simply visiting a website and making a few clicks. By developing a Web App Nursery, i.e., a sandbox environment to safely develop, test and deploy GIS-enabled web applications for environmental data management and modelling, our goal is to significantly simplify developing, testing, deploying and sharing such tools. This NASA GEOGLOWS project addresses the growing need for flexible data analysis and modelling environments that provide users with the ability to explore, analyze, and model earth observation data in a software-as-a-service, web-based environment. We are using the Tethys Platform cyberinfrastructure – a set of open source GIS and web development tools – to create a warehouse for rapid deployment of open source hydroinformatics apps for managing and using essential water resources variables in support of the GEOGLOWS and other GEO Work Programme elements. The App Nursery will be deployed as part of the HydroShare data sharing project and will allow third party developers to create, test, and share web based apps in a safe environment. Ultimately our intention is to use this web app nursery to foster a community of app developers in the environmental data management and modelling domain who will share apps through a curated “app store”. This presentation will introduce the nascent App Nursery, including a demonstration of existing apps and the use of Docker, Django, GeoServer, and OpenLayers to rapidly create and deploy GIS-enabled web apps for environmental data management and modelling.
9:00 AM
Dr. Md. Nazrul Islam, Associate Professor, Department of Geography and Environment, Jahangirnagar University, Savar, Dhaka-1342, Bangladesh 9:00 AM - 10:20 AM A three-dimensional Marine Environmental Committee (MEC) model was applied to describe the specific circulation patterns of currents, temperature, and salinity driven by wind and tide forcing in Kamaishi Bay, Miyagi Prefecture, in the region affected by the Great East Japan Earthquake. The major concern of this study is the diffusion of pollutants caused by the 2011 earthquake and tsunami disaster and its impacts on the marine ecosystem. In this study, we also simulate the changes in water quality and ecosystem structure from January 2009 to December 2012. The MEC model has been used to predict the distributions of various key water quality indicators and tidal flow in the different layers of Kamaishi Bay. High correlation is obtained between simulation-derived and measurement-derived tidal characteristics. We also simulated the effects of breakwaters on the tide and currents and on integrated aquaculture and fisheries. The wind-driven flow using mean seasonal wind forcing (NE, SE, and SW) creates different circulations over Kamaishi Bay. The current variability in shallow areas is influenced by the prevailing winds. Similarly, the temperature and salinity distribution of Kamaishi Bay waters is characterized by strong seasonal variations. The water quality is intensely affected by pollutants and has continually deteriorated due to increased discharges of domestic and industrial waste as well as increased loading of anthropogenic contamination into the Bay. The results showed that measured and simulated pollutant concentrations were below the environmental standards in Japan, and observed and simulated DO, T-N and T-P concentrations were not substantially different from those before the disaster.
10:40 AM
A Python Package for Computing Error Metrics for Observed and Predicted Time Series Wade Roberts, Brigham Young University 10:40 AM - 12:00 PM Error metrics are statistical measures used to quantify the error or bias of forecasted model data compared to observed data. Error metrics are used extensively in water resource engineering when evaluating hydrologic models to determine the accuracy and applicability of the model. The literature reports a large number of error metrics; however, it is not always clear which metric to use and which metrics are applicable to time series data, as different metrics highlight different biases or errors. We created a Python package for hydrologic time series data with over 50 commonly used error metric functions as well as visualization and data management tools. The functions include error checks to make sure that the input data meet requirements and will return real values. The package includes references, explanations, and source code. In this paper we provide an introduction to the package, including descriptions of the error metrics implemented and recommendations for use, along with examples of use with sample data.
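As a flavour of what such metric functions look like, here is a self-contained Nash-Sutcliffe efficiency implementation with simple input checks; it illustrates the idea and is not the package’s actual function signature.

```python
import numpy as np

def nash_sutcliffe(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; values <= 0 mean the
    model is no better than the mean of the observations."""
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    if sim.shape != obs.shape:
        raise ValueError("simulated and observed series must have the same length")
    if np.isnan(sim).any() or np.isnan(obs).any():
        raise ValueError("input series must not contain NaN values")
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

print(nash_sutcliffe([4.9, 5.2, 6.1], [5.0, 5.0, 6.0]))
```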
10:40 AM
AgroDataCube and AgInfra Plus: Operationalising Big Data for Agricultural Informatics Rob Knapen, Wageningen University and Research 10:40 AM - 12:00 PM Big Data methods and tools are becoming widely adopted by the ICT industry and create new opportunities for data intensive science in the agro-environmental domain. However, Big Data adoption is still in its infancy for Agricultural Information Systems, and many barriers still exist to wider use of big data analysis in agricultural research. Moreover, curated collections of Big Data for agriculture are currently largely missing, limiting the possibilities for big data analytics based on machine learning techniques in agriculture. The AgroDataCube strives to break through this lock-in situation by providing researchers, practitioners and industry with a reference data warehouse for working with a number of large spatial open datasets relevant to agriculture. It is developed and tested iteratively by promoting it in a number of FarmHacks, hackathons that specifically target the use of open data and open source in the agro-environmental domain. Furthermore, two possible Use Cases for more data-driven agriculture will be explored in the AgInfra Plus European research project. AgInfra Plus is the testbed sister project to eRosa, a project defining a roadmap for the use of e-Infrastructure in agricultural research. A use case on crop modelling will explore the use of virtual research environments and cluster computing for crop simulation, while the other use case will look into crop phenology estimation and prediction. This paper will give an overview of the ongoing work on AgroDataCube and AgInfra Plus, and will describe the bottlenecks encountered so far and the paths taken toward enabling these exciting new possibilities for smart agriculture.
10:40 AM
Scalable Big Data Platform, Mining and Analytics Services for Optimized Forecast of Animal Habitats Dr Zoheir Sabeur, University of Southampton, IT Innovation Centre, School of Electronics and Computer Science 10:40 AM - 12:00 PM The effects of climate change have been observed for decades, now that we have access to multiple methods of Earth Observation (EO) using in situ, air-borne and space-borne sensing. The EO Big Data generated from these sources is of paramount importance for scientists to understand the effects of climate change and the specific engendered natural (and anthropogenic) processes that are likely to trigger the changing behaviour of species on Earth. In the EO4wildlife project (http://www.copernicus.eu/projects/eo4wildlife), we have access to Copernicus and Argos EO Big Data for investigating the changes of habitats for a variety of marine species. The challenge is to forecast the habitats by identifying the causal relationships between animal presence and metocean environmental fronts. This is achieved by processing data on animal presence, which are relatively small in size and sparse, and their correlation with environmental datasets, which are large and dense in feature space. This poses big data challenges in terms of optimisation of resources, mining and feature selection. Once overcome, it improves the performance of the forecasting models. The availability of big geospatial information, satellite data and in situ observations enabled us to experiment with the scalability of our distributed data storage technologies and analytics services in the cloud. We specifically deployed cluster infrastructure via Spark for a resilient distribution of processing over multiple nodes. Our big data processing performance is validated in testbed experiments under three types of selected habitat forecasting workflows. These will be described in detail in the completed version of this paper.
10:40 AM
Shipping environmental software as R packages Brad Eck 10:40 AM - 12:00 PM This paper shares recent experience of packaging an existing open-source simulation engine for use in the R environment. R has become popular in many sectors, including environmental analysis, and the number of packages providing add-on functionality continues to grow rapidly. R packages conform to a particular structure and so have common attributes with respect to reuse and interoperability. These features made R a good fit for the four goals of our project: (1) The software should be straightforward to obtain and operate on several computing platforms, especially Windows, Mac and Linux. (2) To drive re-use, the software should have documentation covering all of the user-visible functions and including some examples. (3) The software should be obtainable and usable in several commercial cloud computing environments. (4) Finally, it should be possible to achieve some degree of parallelization of simulations. The process of packaging a simulation engine for R revealed several lessons for the practice of environmental software development. We share those lessons and make an assessment of the merits and limitations of shipping environmental software for R.
2:00 PM
Benchmarking Apache Spark spatial libraries Hector Muro Mauri, Wageningen University and Research 2:00 PM - 3:20 PM Apache Spark is one of the most widely used and fast-evolving cluster-computing frameworks for big data. This research investigates the state of practice in the Apache Spark ecosystem for managing spatial data, with a specific focus on spatial vector data. Apache Spark is a relatively new platform, and the associated libraries for geospatial data extensions are still work in progress. In this work, three libraries for managing geospatial information in Apache Spark have been investigated, namely GeoSpark, GeoPySpark, and Magellan. First we designed and performed a suite of functionality tests to explore how much can be done with each library. Then, we benchmarked the performance of the libraries for executing common spatial tasks using annoyingly big geospatial datasets. Finally, we compared the performance of the three libraries with that of a traditional Geographic Information System that uses a relational database for storage. Our findings about the maturity of the libraries and the scalability of solutions in Apache Spark are mixed: key functionalities are still missing, but gains in the elapsed real time to respond to queries can be up to two orders of magnitude.
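To give a sense of the benchmarking set-up, the sketch below times one query with plain PySpark; it stands in for the real benchmarks, which use the spatial libraries named above (GeoSpark, GeoPySpark, Magellan) and real geometries rather than a simple bounding-box filter on synthetic points.

```python
import time
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spatial-benchmark-sketch").getOrCreate()

# 10 million random points as a stand-in for a large vector dataset
points = spark.range(0, 10_000_000).select(
    (F.rand() * 360 - 180).alias("lon"),
    (F.rand() * 180 - 90).alias("lat"),
)

start = time.time()
n = points.filter(F.col("lon").between(4.0, 7.0) & F.col("lat").between(50.0, 54.0)).count()
print(f"{n} points in window, {time.time() - start:.1f} s elapsed")
```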
2:00 PM
Easy to Use Workflows for Catchment Modelling: Towards Reproducible Model Studies Celray James Chawanda, Vrije Universiteit Brussel 2:00 PM - 3:20 PM Catchment scale hydrological models have a variety of users with different technical backgrounds. These users often need to adapt their model before it can be applied to their case study. To this end, most catchment models use a Graphical User Interface (GUI) to allow direct manipulation of the models. While a GUI is generally easy to use for novice users, it opens many sources of irreproducible research in the scientific community. Here we present a workflow for the Soil and Water Assessment Tool (SWAT) that promotes reproducible model studies while remaining user-friendly for both novice and expert users. The Python-based wrapper uses pre-processed input data and a namelist file to build the QSWAT model and run it without further user interaction. We then apply this environment to the Blue Nile catchment and show that it yields almost exactly the same results as building the QSWAT model through the GUI. Our results indicate benefits of using the automated workflow over the GUI in reproducing earlier results and implementing changes to an existing set-up, while saving time in the model building process. All the while, the model configuration can still be viewed and modified in the GUI. We conclude that workflows can help reduce cases of irreproducible research in catchment modelling and offer benefits for researchers building upon existing model configurations. Workflows also open up opportunities for using high performance infrastructure for large catchment model setups without losing interoperability with GUIs. (This workflow is publicly available on GitHub: https://github.com/VUB-HYDR/2018_Chawanda_etal_EMS)
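The sketch below illustrates the namelist idea in general terms: every choice needed to rebuild and rerun the model lives in one plain-text file. The section names, keys, and the placeholder build/run functions are hypothetical and do not reproduce the published wrapper’s API.

```python
import configparser

NAMELIST = """
[inputs]
dem = blue_nile_dem.tif
landuse = landuse.tif
[period]
start = 1990-01-01
end = 2005-12-31
"""

def build_model(settings):
    # placeholder for the step that drives QSWAT model building from the inputs
    print("building model from", settings["dem"], "and", settings["landuse"])

def run_model(settings):
    # placeholder for the step that launches the SWAT run for the set period
    print("running", settings["start"], "to", settings["end"])

cfg = configparser.ConfigParser()
cfg.read_string(NAMELIST)            # in practice: cfg.read("model.namelist")
settings = {k: v for s in cfg.sections() for k, v in cfg[s].items()}
build_model(settings)
run_model(settings)
```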
2:00 PM
Michael Berg-Mohnicke, Leibniz Centre for Agricultural Landscape Research 2:00 PM - 3:20 PM In the field of agricultural and environmental research, different kinds of models are in use, and current efforts are underway to assemble them into ever more complex systems to support decision making, evaluate climate change adaptation strategies, or learn about possible impacts of land-use change. While in the past these systems were usually built in a monolithic manner, the connected world of today enables us to rethink this approach. Using two examples from the agricultural domain, we show how complex systems can be flexibly designed by integrating models via message passing interfaces. We discuss our experience in using the ZeroMQ library to enable our scientists to run large regional-scale simulations on high-performance computers, quickly comply with simulation protocols, meet deadlines, and still be able to debug their software. A second example demonstrates how the same architecture can be used to couple a complex agro-ecosystem model (MONICA) to an irrigation advisory system (WEB-BEREST). Looking into the future and extrapolating the consequences of applying the message passing/flow-based paradigm to the field of environmental software, we identify opportunities for improved scientific cooperation between research institutions and working groups.
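As a concrete illustration of the message-passing pattern, the sketch below runs a toy "model" behind a ZeroMQ REP socket and queries it from a REQ client using pyzmq; the JSON message fields and the one-line crop response are invented for illustration and are not the MONICA/WEB-BEREST protocol.

```python
import json
import threading
import zmq

def model_server(endpoint="tcp://127.0.0.1:5555"):
    # toy "crop model" component answering one request over a REP socket
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.REP)
    sock.bind(endpoint)
    request = json.loads(sock.recv())                        # e.g. one day of weather input
    response = {"yield_t_ha": 0.01 * request["precip_mm"]}   # stand-in calculation
    sock.send_string(json.dumps(response))

threading.Thread(target=model_server, daemon=True).start()

client = zmq.Context.instance().socket(zmq.REQ)
client.connect("tcp://127.0.0.1:5555")
client.send_string(json.dumps({"precip_mm": 12.0}))
print(json.loads(client.recv()))
```

The same pattern scales out naturally: each model component binds its own socket, and a thin coordinator routes messages between them instead of linking everything into one monolithic executable.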
3:40 PM
Agricultural Model Exchange Initiative (AMEI) Andreas Enders, Universität Bonn 3:40 PM - 5:00 PM Model development for managed environmental systems, and in particular agricultural systems, is complex and driven by both biophysical and socio-economic processes. Additional complexity is created by the context- and scale-dependency of the main drivers. The exchange approach has to offer scientists the possibility to create highly diverse models (modelling solutions), combining model components from different domains seamlessly. The AgMIP initiative has shown that it is not sufficient to run one model to estimate changes in agricultural systems. AMEI aims to rise to the different challenges of exchanging model components by: (1) defining standards to describe a model component exchange format; (2) developing a (web) platform to publish, cite and exchange code and model algorithms; (3) checking and publishing different levels of quality in the documentation of the included algorithms; and (4) including unit tests and standard parametrizations. The authors’ organizations have invested in recent years in enabling their modelling platforms to interact with this exchange approach, by integrating wrappers and/or component import/export converters. Partner platforms currently interacting are APSIM, BioMA, CropSyst, DSSAT, OpenAlea, RECORD, SIMPLACE, SiriusQuality, and STICS. The talk will provide, in sociological, scientific and technical terms, a conceptual overview of the state of this work. The presenter will give practical examples of successful component exchange between different frameworks and will offer the opportunity to join the AMEI group.
3:40 PM
Solar Energy Potential Assessment for Entire Cities - a Reusable and Scalable Approach Sukriti Bhattacharya, Luxembourg Institute of Science and Technology 3:40 PM - 5:00 PM Recent technological advances in massive geospatial data collection, assessing both the temporal and spatial dimensions of data, add significant complexity to the data analysis process and provide new dimensions for data interpretation. Accordingly, geographical information systems (GIS) must evolve to represent, access, analyze and visualize big spatiotemporal data in a scalable, integrated way. Often, sharing and transferring such information through deep-dive and automated analysis causes scalability challenges at the software level that impact overall performance, throughput, and other performance parameters. In this paper, we demonstrate the whole implementation, explaining practical steps to scale solar irradiation calculations for entire cities at very high space-time resolution by using a scalable tensor data structure and the inherent parallelism offered by a dataflow-based implementation. We attempt to improve the understanding of the underlying equations and data structures from an analytical, a geometric and a dynamical systems perspective. The entire model is implemented in TensorFlow, an open source software library developed by the Google Brain Team using data flow graphs and the tensor data structure. To assess the performance and accuracy of our TensorFlow-based implementation, we compared it to the well-known r.sun module from GRASS GIS and to PVLIB from the National Renewable Energy Laboratory (USA) for solar irradiation simulations. Results show that we achieved noticeable and significant improvements in overall performance while keeping accuracy differences negligible.
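The sketch below shows the vectorised style of computation the abstract refers to, using TensorFlow element-wise operations to evaluate a simple cosine-of-incidence term over a raster-sized tensor; it is a stand-in for the full r.sun-style irradiation model, and the constants are illustrative.

```python
import numpy as np
import tensorflow as tf

# per-cell slope and aspect for a city-sized raster, in radians (random stand-ins)
slope = tf.constant(np.deg2rad(np.random.uniform(0, 45, (1000, 1000))), dtype=tf.float32)
aspect = tf.constant(np.deg2rad(np.random.uniform(0, 360, (1000, 1000))), dtype=tf.float32)
sun_alt = tf.constant(np.deg2rad(35.0), dtype=tf.float32)   # solar altitude angle
sun_az = tf.constant(np.deg2rad(180.0), dtype=tf.float32)   # solar azimuth angle

# cosine of the incidence angle between the sun vector and each cell's surface normal
cos_inc = (tf.sin(sun_alt) * tf.cos(slope)
           + tf.cos(sun_alt) * tf.sin(slope) * tf.cos(sun_az - aspect))
direct = 900.0 * tf.maximum(cos_inc, 0.0)   # W/m^2, with a nominal beam irradiance
print(float(tf.reduce_mean(direct)))
```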
Tuesday, June 26th
9:00 AM
Gregory E. Tucker, University of Colorado Boulder 9:00 AM - 10:20 AM Modeling software expresses, in numerical form, our ideas about how environmental systems work. The codes we use to express these ideas should ideally be flexible enough to evolve as the ideas themselves evolve, but all too often the software engineering becomes a barrier to progress. In this paper, we present the design concepts behind the Landlab Toolkit, a Python programming library intended to speed the process of creating and modifying two-dimensional, grid-based numerical models. Landlab’s design goals included: (1) making it simple to create and configure 2D grids of various types, (2) supporting re-usable components, (3) allowing multiple components to share a common grid and data arrays, and (4) operating in a high-level, open-source language that offers a rich set of libraries. The first goal is met by providing several grid classes from which a grid of desired scale and dimensions can be constructed. A grid object contains a set of graph elements (such as nodes, links, and patches) along with data structures that describe the connectivity among them. To meet goal 2, Landlab uses a standard design for components, which are also implemented as classes. Goal 3 is met by allowing components to attach fields to the grid, where a field is a data array that is tied to a particular type of grid element. Finally, implementing Landlab in Python achieves goal 4. To illustrate how this functionality works in practice, we present several examples of Landlab-built models, including applications in overland-flow dynamics, landform evolution, and cellular automata.
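The grid/field/component pattern described above looks roughly like the following in code (assuming a recent Landlab release; the component choice and grid size are illustrative):

```python
import numpy as np
from landlab import RasterModelGrid
from landlab.components import FlowAccumulator

grid = RasterModelGrid((40, 50), xy_spacing=10.0)          # goal 1: build a grid
z = grid.add_zeros("topographic__elevation", at="node")    # goal 3: a shared field
z += np.random.rand(grid.number_of_nodes)                  # simple random terrain

fa = FlowAccumulator(grid)                                  # goal 2: a re-usable component
fa.run_one_step()
print(grid.at_node["drainage_area"].max())                  # field written by the component
```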
9:00 AM
Geoscience model coupling in a Python framework: PyMT Eric Hutton, University of Colorado Boulder 9:00 AM - 10:20 AM The current landscape of geoscience models is broad not only in scientific scope, but also in type. On one hand, the variety of models is exciting, as it provides fertile ground for extending or linking models to answer scientific questions. On the other hand, models are written in a variety of programming languages, operate on different grids, use their own file formats (both for input and output), have different user interfaces, have their own time steps, etc.; each of these factors becomes an obstruction for scientists wanting to couple, extend, or simply run existing models. And all this is before the scientific difficulties of coupling or running models are addressed. The Community Surface Dynamics Modeling System (CSDMS) is developing the Python Modeling Toolkit (PyMT) to help non-computer scientists deal with these sorts of modeling logistics. PyMT is the fundamental package CSDMS uses for running and coupling models that expose a Basic Model Interface (BMI). Here, we introduce the basics of the current beta version of PyMT and provide an example of coupling models of different domains and grid types.
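The calling pattern PyMT standardises is essentially the Basic Model Interface loop sketched below; the model objects and the variable name are placeholders written in the CSDMS standard-name style, not actual PyMT model classes.

```python
def couple(model_a, model_b, config_a, config_b, stop_time, dt):
    """Schematic BMI-style coupling loop (illustrative, not PyMT's actual API)."""
    model_a.initialize(config_a)
    model_b.initialize(config_b)
    t = 0.0
    while t < stop_time:
        # hand model A's output to model B before each step
        waves = model_a.get_value("sea_surface_water_wave__height")
        model_b.set_value("sea_surface_water_wave__height", waves)
        model_a.update()
        model_b.update()
        t += dt
    model_a.finalize()
    model_b.finalize()
```

PyMT adds the logistics around this loop (unit conversion, grid mapping, file formats) so that any two BMI-enabled models can be stepped together in this way.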
9:00 AM
R and Python Annotation Bindings for OMS Francesco Serafin, University of Trento 9:00 AM - 10:20 AM OMS3 is an environmental modeling framework designed to support and ease the development of scientific environmental models. It is implemented in Java, a programming language that makes the framework flexible and non-invasive. Java is consequently the natural language for developing OMS-compliant components. However, OMS3 ensures the longevity of older model implementations by providing C/C++ and Fortran bindings that allow slightly modified legacy software to be connected to freshly developed Java components. Recently, three scientific programming languages have drawn the modeling community’s attention: R, Python, and NetLogo. They have a gentle learning curve, and the lack of declared data types makes them well suited to fast scripting. Furthermore, they rely on active developer communities that keep releasing and improving open source scientific packages. This is a relevant aspect when it comes to facilitating and speeding up the implementation of scientific algorithms. OMS3 functionalities have been enhanced to provide R, Python, and NetLogo bindings. As a result, multi-language modeling solutions are fully interoperable. Thanks to the framework’s non-invasiveness, R, Python and NetLogo scripts need only be slightly modified with source code annotations to become OMS-compliant components. The resulting components are nevertheless still executable from within their original environments. This contribution shows two actual applications of the R and Python bindings: the Regional Urban Growth (RUG) model implemented in R, and the TRansportation ANalysis SIMulation System (TRANSIMS) model, which requires the RTE Python module. To avoid the burden of installing the required software stacks, OMS3 has been bundled into a Docker image. The result is enhanced flexibility in the modeling workflow.
9:00 AM
Software Development Best Practices in Integrated Environmental Model Development Takuya Iwanaga, Australian National University 9:00 AM - 10:20 AM Integrated models are often made up of smaller component models, each representing a particular domain, that are coupled together. Such models are software for a scientific purpose, and so similarities between model and software development exist. These models tend to be developed by researchers who take on the dual role of scientist and software developer. Despite the similarities in development approaches, many best practices found within the field of software engineering may not be applied. This leads to issues revolving around the reusability, interoperability, and reliability (in terms of model application and results) of models developed for integrated assessment. To address these concerns, recent efforts have seen the development and proliferation of component-based implementation approaches, repositories that act as stores of reusable models, and model development frameworks. These approaches by themselves are not a panacea for model development issues. Component models may be difficult to integrate and error-prone due to the aforementioned lack of best practices. The structure of component and integrated models found in repositories may render them difficult to reuse or reapply in a different context. Model development frameworks may ease the overall technical burden of model development and integration; however, they often come with a steep learning curve of their own, which may hamper their effective use. This may in turn exacerbate issues regarding model reusability and interoperability. In this paper we suggest guidelines and general directions, identified and supported through literature review and expert knowledge, to fill the gap between the software and modelling paradigms.
10:40 AM
A multi-level decision support system for energy optimization in WWTPs Dario Torregrossa, Luxembourg Institute of Science and Technology 10:40 AM - 12:00 PM The availability of real-time measurements in Wastewater Treatment Plants (WWTPs) can produce environmental and economic benefits. Since a WWTP can produce up to 300k records per day, computational analytics support is necessary for efficient decision-making. Recently a Shared Knowledge Decision Support System (SK-DSS) was presented, with specific applications for energy saving in pumps and blowers. The SK-DSS is based on fuzzy analysis, identifies the operational conditions of devices and provides case-based solutions. With a large number of monitored devices, it is necessary to provide a global synthetic index able to represent the performance of the plant and the different importance of devices. In this paper, such a global index is proposed, with calculations performed by a multi-level fuzzy logic engine. In the bottom layer of this multi-level fuzzy logic engine, pumps and blowers are individually assessed. The top layer calculates a score in the range [0-100] by processing the outputs of the individual device assessments without losing the detailed information stored at the bottom level. Different weights are attributed to devices in the calibration of the top-layer fuzzification process. The output results are visualized to better identify the sources of inefficiency. Results show the potential of such indicators with a larger number of plant devices and, in the future, the global index will trigger an alert system for plant managers.
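A deliberately simplified stand-in for the two-level idea: bottom-level device scores are combined into a single plant-level score with importance weights. The real SK-DSS engine uses fuzzy membership functions rather than this crisp weighted mean, and the numbers below are invented.

```python
# bottom level: one 0-100 performance score per monitored device (illustrative)
device_scores = {"pump_1": 82.0, "pump_2": 55.0, "blower_1": 40.0}
# top level: weights expressing the relative importance of each device
weights = {"pump_1": 0.25, "pump_2": 0.25, "blower_1": 0.50}

plant_index = sum(device_scores[d] * weights[d] for d in device_scores) / sum(weights.values())
print(round(plant_index, 1))   # global score in [0, 100]; per-device detail is kept below it
```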
10:40 AM
A Semantic Model Catalog to Support Comparison and Reuse Daniel Garijo, University of Southern California 10:40 AM - 12:00 PM OBJECTIVES: Model repositories are key resources for scientists in terms of model discovery and reuse, but do not focus on important tasks such as model comparison and composition. Model repositories do not typically capture important comparative metadata describing assumptions and model variables that would enable a scientist to discern which models would be better for their purposes. Furthermore, once a scientist selects a model from a repository, it takes significant effort to understand and use the model. Our goal is to develop model repositories with machine-actionable model metadata that can be used to provide intelligent assistance to scientists in model selection and reuse. METHODOLOGY: We are extending the OntoSoft semantic software metadata registry (http://www.ontosoft.org/) to include machine-readable metadata. This work includes: 1) exposing model variables and their relationships; 2) adopting a standardized representation of model variables based on the conventions of the Geoscience Standard Names ontology (GSN) (http://www.geoscienceontology.org/); 3) capturing the semantic structure of model invocation signatures based on functional inputs and outputs and their correspondence to model variables; 4) associating models with readily reusable workflow fragments for data preparation, model calibration, and visualization of results. FINDINGS: We have extended OntoSoft to expose model variables and adopted GSN ontologies to describe hydrology models. We are designing representations to capture the semantic structure of model invocation signatures that map model variables to data requirements, to facilitate discovery and comparison of models. SIGNIFICANCE: The extended OntoSoft framework would reduce the time to find, understand, compare and reuse models.
10:40 AM
Imeshi N. Weerasinghe, Vrije Universiteit Brussel 10:40 AM - 12:00 PM Complex agro-environmental models have a large number of parameters, which is problematic during calibration. A sensitivity analysis can help identify the most sensitive parameters, which should be included in the calibration process. Depending on the number of parameters being analysed and the method used, the required number of simulations varies from moderate to very large. Consequently, the number of simulations, the number of years run and the size of the model affect the computation power and thus the computer time required. This often limits the number of parameters and/or number of simulations, and hence the robustness of the analysis. A possible solution is to conduct an initial screening of parameters using a method that requires fewer simulations and can therefore include more parameters. Subsequently, a more robust method can be performed on the resulting smaller set of sensitive parameters using substantially more simulation runs. Additionally, cloud computing and parallelisation can be used to reduce the computation time. SWAT, a complex hydrological model, has a large number of parameters that influence Water Productivity (WP) estimations, defined as the ratio of production (calculated as biomass increment or agricultural crop yield) over water consumption (calculated as evapotranspiration). To aid the calibration process, a sensitivity analysis for WP variables at the basin scale was conducted, investigating the possible benefits of using a two-step method with an initial screening (LH-OAT) prior to running a more advanced quantitative method (Sobol), together with parallelisation and cloud computing. Initial results indicate substantial time benefits of using the two-step method with parallelisation and cloud computing.
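The quantitative second step can be expressed with the SALib package roughly as follows; the parameter names, ranges, and the toy response function standing in for a SWAT run are illustrative.

```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 3,
    "names": ["cn2", "esco", "epco"],          # illustrative SWAT-like parameters
    "bounds": [[35, 98], [0.01, 1.0], [0.01, 1.0]],
}

X = saltelli.sample(problem, 1024)             # each row is one model run
Y = np.array([x[0] * 0.5 + x[1] * 20 + x[2] * 5 for x in X])   # toy response

Si = sobol.analyze(problem, Y)
print(dict(zip(problem["names"], Si["S1"].round(2))))   # first-order Sobol indices
```

Because every row of `X` is an independent model evaluation, the expensive runs parallelise trivially across cloud workers before the analysis step.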
10:40 AM
MINT: Model INTegration Through Knowledge-Powered Data and Process Composition Yolanda Gil, University of Southern California 10:40 AM - 12:00 PM Major societal and environmental challenges require forecasting how natural processes and human activities affect one another. Model integration across natural and social science disciplines to study these problems requires resolving semantic, spatio-temporal, and execution mismatches, which is largely done by hand today and may take more than two years of human effort. We are developing the Model INTegration (MINT) framework that incorporates extensive knowledge about models and data, with several innovative components: 1) New principle-based ontology generation tools for modeling variables, used to describe models and data; 2) A novel workflow system that selects relevant models from a curated registry and uses abductive reasoning to hypothesize new models and data transformation steps; 3) A new data discovery and integration framework that finds and categorizes new sources of data, learns to extract information from both online sources and remote sensing data, and transforms the data into the format required by the models; 4) New knowledge-guided machine learning algorithms for model parameterization to improve accuracy and estimate uncertainty; 5) A novel framework for multi-modal scalable workflow execution. We are beginning to annotate models and datasets using standard ontologies, and to compose and execute workflows of models that span climate, hydrology, agriculture, and economics. We are building on many previously existing tools, including CSDMS, BMI, GSN, WINGS, Pegasus, Karma, and GOPHER. Rapid model integration would enable efficient and comprehensive coupled human and natural system modeling.
10:40 AM
Scott Dale Peckham, University of Colorado, Boulder 10:40 AM - 12:00 PM Every data set and computer model has its own internal vocabulary (i.e. names or labels) for referring to its input and/or output variables. It is therefore difficult to know, even for an expert, whether a variable stored or computed in a given digital resource is equivalent to one needed by another resource. Experts can typically figure this out through a process that may involve examining the equations that are used, being familiar with domain jargon, reading documentation (e.g. source code, manuals and papers) or talking to the developer of the resource. However, this is time-consuming, frustrating and inefficient. The only way to automate this semantic mediation task is with an accurate, one-time mapping of these internal names to variable names in a standardized vocabulary that can be utilized by machines (i.e. accessed via function calls in a program). This task of mapping internal variable names to standardized names is known as "semantic annotation". Once completed, it is possible to automatically perform "semantic alignment" every time that resource is selected for use in a workflow, allowing variables to be correctly passed between coupled resources. We will describe efforts to semi-automatically generate standardized variable names for different domains by building on the foundational and rule-based principles of the Geoscience Standard Names ontology (geoscienceontology.org). Our initial focus will be on measurement concepts in the realms of agriculture, social science, economics, transportation networks and demographics. This work is funded by a project called MINT (Model INTegration) that is part of the World Modelers program.
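In practice, the one-time annotation amounts to a lookup table from a resource’s internal names to standardized names; the sketch below shows the idea, with standard names written in the GSN naming style but not guaranteed to be actual GSN entries.

```python
# illustrative mapping from one model's internal variable names to standardized names
internal_to_standard = {
    "P":     "atmosphere_water__precipitation_volume_flux",
    "Qout":  "channel_exit_water__volume_flow_rate",
    "theta": "soil_water__volume_fraction",
}

def standard_name(internal_name):
    return internal_to_standard[internal_name]

print(standard_name("Qout"))   # the standardized name used for semantic alignment
```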
10:40 AM
Rafael Ferreira da Silva 10:40 AM - 12:00 PM Workflows provide a solid foundation to address model integration challenges. Integrated models may be simply chained, or they may need to run in an interleaved (tightly-coupled) fashion. Data exchange formats may differ significantly (e.g., in scale), and data transformations may be required to convert available data into the formats required by the models. In this work, we are creating the MINT (Model INTegration) environment for workflow composition and execution by extending the well-established workflow composition (WINGS) and execution (Pegasus) systems with a framework for model coupling for execution interleaving (EMELI/BMI). WINGS provides a semantic workflow system that can represent and propagate constraints to validate workflows, while Pegasus enables distributed workflow execution across infrastructures and provides automated data management and fault tolerance. BMI provides a standardized, noninvasive, and framework-independent API for models. Models for integration will be selected from a Model Catalog based on variables of interest (and built on ontologies of standard variable names). Via abductive reasoning, MINT will assess the viability of workflows by hypothesizing data transformation tasks for converting available data into the formats required by the models. Data transformation services will generate multi-step scripts for accommodating the hypothesized data transformation tasks. MINT’s multi-method scalable model execution will then enact the execution of tightly coupled models (using EMELI/BMI) and of independent model chains, applying the required transformations when needed. The MINT integrated modeling environment would facilitate and accelerate modeling analysis by generating new data transformations via abductive reasoning, and by providing scalable execution of chained or tightly-coupled models.
10:40 AM
Using the LASSO to understand groundwater residence times J. Jeffrey Starn, USGS NAWQA Program, East Hartford, CT 10:40 AM - 12:00 PM Groundwater residence-time distributions (RTDs) are critical for understanding lag times between recharge at the water table and base flow in streams. However, RTDs cannot be measured directly—they must be inferred from an analysis of data using models. Glacial aquifers present challenges to modeling approaches because they are spatially discontinuous and have highly variable properties. An innovative approach by the USGS uses machine learning in conjunction with numerical models, resulting in a rapid and robust way of generating RTDs. To demonstrate the method, computer programs were used to automatically create generalized finite-difference groundwater flow models in 30 watersheds across the northeastern glaciated U.S. RTDs were calculated from these models using flux-weighted particle tracking. Targets for machine learning were created from the simulated RTDs by fitting 3-parameter Weibull distributions. A form of penalized linear regression called Multitask LASSO (Least Absolute Shrinkage and Selection Operator) regression was trained on the Weibull parameters using hydrogeographic variables of the modeled domains as explanatory features. Because LASSO features are standardized, coefficient magnitudes can be compared to determine the relative importance of the features. Multitask LASSO was used to estimate the three Weibull parameters simultaneously, thus ensuring that the same features were used to estimate all of the parameters. The results show that aquifer heterogeneity and the exchange of water between glacial deposits, bedrock and surface water are important for estimating RTDs. The quantitative understanding gained from the LASSO permits RTDs to be estimated across the glaciated region.
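The regression step maps onto scikit-learn’s MultiTaskLasso directly; the sketch below uses random stand-in data with the shapes described (30 watersheds, three Weibull parameters) purely for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))          # 30 watersheds x 8 hydrogeographic features
W = rng.normal(size=(30, 3))          # fitted Weibull parameters as joint targets

X_std = StandardScaler().fit_transform(X)      # standardize so coefficients are comparable
model = MultiTaskLasso(alpha=0.1).fit(X_std, W)

# one coefficient row per Weibull parameter; summed magnitudes rank feature importance
print(np.abs(model.coef_).sum(axis=0))
```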
2:00 PM
Faizal Rohmat, Colorado State University 2:00 PM - 3:20 PM The productivity of irrigated agriculture in Colorado’s Lower Arkansas River Basin (LARB), along with similar basins throughout the western U.S., is threatened by salinity and waterlogging problems resulting in reduced crop yields and abandoned cropland. In addition, over-irrigation and seepage from unlined canals have resulted in elevated concentrations of nutrients and trace elements, such as selenium, leaching from underlying marine shales into groundwater and the river that exceed environmental standards. Intensive data collection and modeling efforts by Colorado State University in the LARB over the past 20 years have resulted in the development of the river basin management model GeoMODSIM, along with calibrated, spatially-distributed regional-scale groundwater modeling based on MODFLOW-UZF for evaluation of best management practices (BMPs) for improving water quality and boosting productivity. GeoMODSIM simulates basin-wide water management strategies to offset impacts of altered return flow patterns resulting from BMP implementation. This is required to ensure compliance with Colorado water right priorities and the Colorado-Kansas interstate compact. It is essential that the model be linked with MODFLOW-UZF for accurate modeling of the complex stream-aquifer system of the LARB. Unfortunately, integration of MODFLOW-UZF with GeoMODSIM is hampered by intense computational requirements that render direct linkage intractable. An artificial neural network (ANN) has been successfully developed, trained, and tested to serve as an accurate and computationally efficient surrogate for MODFLOW-UZF that can be directly linked with GeoMODSIM. This permits assessment of basin-scale impacts of various BMP scenarios using input-output datasets generated from numerous MODFLOW-UZF simulations in the LARB.
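The surrogate idea, stripped to its essentials: fit a neural network to input-output pairs harvested from many groundwater-model runs and use it in place of the expensive model. The use of scikit-learn and the random stand-in data below are illustrative; they are not the ANN implementation used in the study.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(size=(2000, 6))    # e.g. canal seepage, pumping, recharge inputs per run
y = X @ rng.uniform(size=6) + 0.05 * rng.normal(size=2000)   # stand-in for return flows

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
ann = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(X_tr, y_tr)
print("R^2 on held-out surrogate targets:", round(ann.score(X_te, y_te), 3))
```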
2:00 PM
Tomasz E. Koralewski, Texas A&M University 2:00 PM - 3:20 PM Specific question-driven ecological models often require representation of general physical environmental processes for which complex, widely-accepted meteorological, hydrological, and oceanographic models are available. Although conceptual coupling of physical environmental and ecological models is straightforward, computational linkages often pose insurmountable problems to ecological modelers with limited access to software engineering expertise. Establishing an ongoing dialogue during the course of a simulation between georeferenced physical environmental models and spatially-explicit ecological models can be particularly problematic. We describe a general coupling framework that allows for a modular structure through an intermediate layer between the existing physical models and the custom-written spatially-explicit ecological models. We demonstrate the applicability of this general framework by computationally linking HYSPLIT with a spatially-explicit ecological model implemented in NetLogo. HYSPLIT is a widely-used complex meteorological model whose applications include simulation of air particle transport, dispersion, and deposition. In ecological applications, the “air particles” can represent air-borne insects such as aphids. NetLogo is a popular programming platform for spatially-explicit, individual-based ecological modelling, which we have coupled with HYSPLIT to simulate regional aphid population growth and spread. We describe a custom-written program that facilitates an ongoing dialogue between these two models. The dialogue takes place along a temporal scale, on a daily basis, for a period of time constrained only by the study objective. The program should be readily adaptable to coupling other sets of physical environmental and ecological models, with adjustments for the programming language used and the specifics of the input/output files.
2:00 PM
Data and model framework for a community Industrial Ecology Socio-environmental Model Christopher Mutel 2:00 PM - 3:20 PM Industrial ecology is a diverse community covering many research areas and application domains; in some areas, such as Life Cycle Assessment (LCA), common databases and data formats are widely used, while in other areas, like Material Flow Assessment (MFA), there are no common data formats or databases. Recent work has shown that there is a common underlying knowledge model across most industrial ecology domains. This common socio-economic metabolism is a spatially- and temporally-resolved graph of product and service flows throughout the economy, including into and out of stocks. In this presentation, we review previous work to develop a common ontology for LCA and MFA, and describe a draft simple common format and ontology for industrial ecology data using JSON linked data. We demonstrate how this format can be applied to existing data sources, and how combining a common ontology with existing common nomenclature systems can lead to a radical reduction in the effort needed to share data. While further effort is needed to create a complete data format, including e.g. material properties and details on data entry and review, our simple data format can already be used in open source software such as Brightway (https://brightwaylca.org/).
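Purely as an illustration of the JSON linked data approach, a product-flow record under such a shared ontology could look like the following; every key, URI and value here is invented and does not reproduce the draft format itself.

```python
import json

# hypothetical JSON-LD record for one product flow in a socio-economic metabolism graph
flow = {
    "@context": "https://example.org/industrial-ecology.jsonld",
    "@type": "ProductFlow",
    "@id": "urn:example:flow:steel-to-construction-2016",
    "product": "hot rolled steel",
    "fromActivity": "steel production",
    "toActivity": "building construction",
    "amount": {"value": 1.2e6, "unit": "tonne"},
    "location": "CH",
    "year": 2016,
}
print(json.dumps(flow, indent=2))
```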
2:00 PM
Framework-enabled Meta-Modeling Francesco Serafin, University of Trento 2:00 PM - 3:20 PM Applications of physically-based environmental models originating from research should be usable ubiquitously in both research and planning/consulting environments. However, due to their complexity, data resolution requirements, number of parameters, platform affinity, and other criteria, they are rarely suited "out of the box" for field applications. Results from physically-based models are considered the most accurate, but operating an entire system requires dedicated knowledge, extensive set-up, and sometimes significant computational time. Questions from field applications, conversely, require easy-to-obtain, quick and "accurate enough" results. The use of web services might alleviate some of the implications for model users but ultimately shifts the responsibility and workload to the hosting environment. To help close the gap between research and field models, we propose a machine learning (ML)-based meta-model approach that aims to capture the intrinsic knowledge of a physical model in an ensemble system of artificial neural networks and make it available for providing simplified answers to problem-specific questions in the field. A meta-modeling approach was developed to help the transition from research to field by enabling a modeling framework to interact with ML libraries and derive model surrogates for a(ny) modelling solution. The Cloud Services Integration Platform CSIP/OMS was extended and utilized to harvest data and derive the meta-model. Here, NeuroEvolution of Augmenting Topologies (NEAT) techniques in an ensemble application, combined with ANN uncertainty quantification, are the main methodologies used. Two example applications have been prototyped and will be presented: a sheet and rill erosion model and a daily runoff model.
2:00 PM
Holger R. Maier, University of Adelaide 2:00 PM - 3:20 PM The optimal long-term sequencing of water infrastructure is complicated by the need to account for uncertainty due to factors such as climate change and demographics. This requires the calculation of robustness metrics in order to assess system performance, necessitating computationally expensive simulation models to be run a large number of times within each optimisation iteration, leading to infeasible run times. In order to overcome this shortcoming, an approach is developed that uses metamodels instead of computationally expensive simulation models in robustness calculations. The approach is demonstrated for the optimal sequencing of water supply augmentation options for the southern portion of the water supply system of Adelaide, South Australia. A 100-year planning horizon is subdivided into ten equal decision stages for the purpose of sequencing various water supply augmentation options, including desalination, stormwater harvesting and household rainwater tanks. The objectives include the minimization of the average present value of supply augmentation costs, the minimization of the average present value of greenhouse gas emissions and the maximization of supply robustness. The uncertain variables are rainfall, per capita water consumption and population. Decision variables are the implementation stages of the different water supply augmentation options. Artificial neural networks are used as metamodels to enable all objectives to be calculated in a computationally efficient manner at each of the decision stages. The results illustrate that the ANN models are able to replicate the outputs of the simulation models to within 5% accuracy while reducing the overall computational effort from an estimated 33.6 years to 50 hours.
2:00 PM
Ievgen Ievdin, Section SW 2.2 - Decision Support Systems, Federal Office for Radiation Protection, Neuherberg, Germany 2:00 PM - 3:20 PM Decision Support Systems for off-site emergency management in the case of a nuclear accident should integrate, among others, real-time monitoring systems around a Nuclear Power Plant, regional GIS information, source term databases and geospatial data on population and environmental characteristics. They should comprise state-of-the-art models to simulate the fate of accidentally released radionuclides in air, water, vegetation, and soil to estimate exposure of the population via all relevant exposure pathways. The real-time online decision support system RODOS has been developed under the auspices of the European Commission's RTD Framework programmes since 1992 to achieve the above-formulated objectives. RODOS was re-engineered in the last decade as the multiplatform software system JRODOS in a Java environment. The software architecture of JRODOS organizes the dataflow between different sources and recipients, e.g., databases, numerical models and the user interface, via unified data objects. These objects (data items) are organized in an expandable hierarchical tree of Java classes, using the benefits of object-oriented programming principles. Numerical model integration is carried out by distributed wrapper objects (DWO), which provide logical, visual and technical integration of computational models with the system core, even if the models use different programming languages such as FORTRAN, C, and Java. The DWO technology supports various levels of interactivity required by different computational models, including pull- and push-driven chains, user interaction support, and sub-model calls. The DWO and Data Item approaches are applicable for integrating into a DSS sets of different computational models that read and produce scalars and arrays.
2:00 PM
Chenda Deng 2:00 PM - 3:20 PM Regions of irrigated farmland in the South Platte River Basin (SPRB) in northeastern Colorado have recently experienced conditions of extremely shallow water table depths (< 1 m), which have resulted in waterlogged soils and flooded basements. Reasons for the rising water table elevation are likely a combination of decreased groundwater pumping compared to previous decades, an increase in surface water irrigation, seepage from earthen irrigation canals, and the implementation of recharge ponds. The objective of this study is to assess these individual contributions and their impact on water table elevations. In the first phase, a MODFLOW model is built for the LaSalle/Gilcrest area in the SPRB. The MODFLOW model is refined to a 3-day time step and has 10 layers that describe the geologic layering in the aquifer, with a three-dimensional map of hydraulic conductivity constructed from lithology from over 400 borehole records. The model is calibrated for the 1950-2000 period and then tested for 2000-2012 using observation well data. Sensitivity analysis techniques are used to determine the contribution of sources and sinks (pumping, recharge, canal seepage, etc.). In the second phase, an Artificial Neural Network (ANN) approach is applied to learn and predict the groundwater level in the same study region. Forty ANNs are trained for 40 monitoring wells using data from 1950-2000 and tested with data from 2000-2012. The results show that the ANNs are faster and more accurate compared to MODFLOW. Similar results are found in the sensitivity analyses of both methods.
3:40 PM
Accuracy and computing speed of Earth’s Surface Modelling TianXiang Yue, Institute of Geographical Sciences and Natural Resources Research, University of Chinese Academy of Sciences 3:40 PM - 5:20 PM In terms of the fundamental theorem of Earth’s surface modelling, an Earth’s surface or a component surface of the Earth’s surface environment is uniquely defined by both the extrinsic and intrinsic invariants of the surface, which can be simulated with an appropriate method for integrating the extrinsic and intrinsic invariants, such as the method for high accuracy surface modeling (HASM), when the spatial resolution of the surface is fine enough to capture the attribute(s) of interest. HASM was developed initially to find solutions to the error problem of environmental systems modelling. However, HASM has a huge computational cost because it must solve an equation set for simulating each lattice of a surface. To speed up the computation of HASM, we developed a multi-grid method of HASM (HASM-MG), a preconditioned conjugate gradient algorithm of HASM (HASM-PCG), an adaptive method of HASM, and an adjustment computation of HASM (HASMAC). The multi-grid method is the fastest numerical method for solving partial differential equations and is based on two principles: error smoothing and coarse-grid correction. The preconditioned conjugate gradient algorithm is obtained by introducing a preconditioner to ensure faster convergence of the conjugate gradient method. The principle of the adaptive method is that grid cells where the error is large are marked for refinement, while grid cells with a satisfactory accuracy are left unchanged. The adjustment computation permits all observations to be entered into the adjustment and used simultaneously in the computations by means of least squares.
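The preconditioned conjugate gradient idea can be illustrated on a small sparse symmetric positive-definite system (a 1-D Laplacian standing in for the HASM equation set) with a simple Jacobi preconditioner:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, cg

n = 1000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")  # SPD stand-in
b = np.ones(n)

diag = A.diagonal()
M = LinearOperator((n, n), matvec=lambda x: x / diag)   # Jacobi (diagonal) preconditioner
x, info = cg(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))                  # info == 0 means converged
```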
3:40 PM
Deep Reinforcement Learning for Optimal Operation of Multipurpose Reservoir Systems Matthew E. Peacock, Colorado State University - Fort Collins 3:40 PM - 5:20 PM Dynamic programming (DP) is considered the ideal optimization method for solving multipurpose reservoir system operational problems since it realistically addresses their complex nonlinear, dynamic, and stochastic characteristics. The only drawback to DP is the so-called “curse of dimensionality” that has plagued the method since its inception by Richard Bellman in the 1950s. Dimensionality issues arise from the need to discretize the state-action space and random variates, which leads to an explosion in computational and memory requirements with increased state-space dimensionality. DP also requires the development of spatial-temporal stochastic hydrologic models for reservoir system operations, which may be difficult under complex climatic and meteorological conditions. A deep reinforcement learning algorithm is applied to solving DP problems for reservoir system operations which effectively overcomes dimensionality issues without requiring any model simplifications or sacrificing any of the unique advantages of DP. The algorithm uses an iterative learning process which considers delayed rewards without requiring an explicit probabilistic model of the hydrologic processes. The algorithm is executed in a model-free stochastic environment whereby the algorithm implicitly learns the underlying stochastic behavior of the system for developing dynamic, optimal feedback operating policies. Dimensionality issues are addressed through the use of accurate function approximators for the state-value and policy functions based on deep neural networks. The deep reinforcement learning algorithm is applied to developing optimal reservoir operational strategies in the Upper Russian River basin of Northern California in the presence of multiple noncommensurate objectives, including flood control, domestic and agricultural water supply, and environmental flow requirements.
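For readers less familiar with the DP baseline being replaced, the toy value-iteration sketch below shows where the curse of dimensionality comes from: even one reservoir needs a value per discretised storage level, and every extra state variable multiplies the table. All numbers are invented for illustration.

```python
import numpy as np

storages = np.arange(0, 101, 5)          # discretised storage states
releases = np.arange(0, 51, 5)           # discretised release actions
inflow, demand, gamma = 20, 30, 0.95     # deterministic toy hydrology

V = np.zeros(len(storages))
for _ in range(500):                      # value iteration until (near) convergence
    V_new = np.empty_like(V)
    for i, s in enumerate(storages):
        best = -np.inf
        for r in releases:
            if r > s + inflow:
                continue                  # cannot release more water than available
            s_next = min(s + inflow - r, storages[-1])
            reward = -abs(demand - r)     # penalise deviation from the demand target
            best = max(best, reward + gamma * V[np.searchsorted(storages, s_next)])
        V_new[i] = best
    V = V_new
print(V.round(1))
```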
|
3:40 PM |
Caleb A. Buahin 3:40 PM - 5:00 PM Transitioning from the traditional approach of executing water resources models on single desktop computers to increasingly ubiquitous High Performance Heterogeneous Computing (HPC) infrastructure introduces efficiencies that could help advance the degree of fidelity of models to the underlying physical processes they simulate. For example, model developers may be able to incorporate more physically-based formulations, perform computations over finer spatial and temporal scales, and perform simulations that span long time periods with reasonable execution times. Additionally, computationally expensive simulations including parameter estimation, uncertainty assessment, multi-scenario evaluations, etc. may become more tractable. The use of HPC for executing these types of simulations within component-based modelling frameworks is an approach that is still largely underutilized in the water resources modeling arena. In this abstract, we describe advancements that we have implemented in the HydroCouple component-based modeling framework to allow water model developers to take advantage of heterogeneous, multi-accelerator clusters. HydroCouple largely employs the OpenMI interface definitions but adds new interfaces to better support standardized geo-temporal data structures, customizable coupled model data exchange workflows, and distributed computations on HPC infrastructure. We also describe how some of these advancements have been used to develop coupled models for two applications: 1) coupling of a one-dimensional storm sewer model with a high resolution, two-dimensional, and overland riverine model for an urban stormwater conveyance system, and 2) coupling of a series of model components being developed to simulate heat transport in heterogeneous rivers with significant longitudinal flow variability. |
|
3:40 PM |
Rajbir Parmar, U.S. Environmental Protection Agency 3:40 PM - 5:00 PM The United States Environmental Protection Agency (EPA) has developed a collection of microservices called Hydrologic Micro Services (HMS) for building hydrologic and water quality modeling workflows. HMS components are available as RESTful web services as well as desktop libraries. An HMS component may have multiple implementations addressing varying levels of underlying physical process details and assumptions. HMS components can be used in desktop and web-based workflows. A workflow can call into a specific implementation of an HMS component depending upon the level of detail suitable for the problem statement being addressed by the workflow. Building a workflow from HMS components enables modelers to address hydrologic and water quality problem statements more precisely, in contrast to the current state of modeling where using existing models forces modelers into a potentially sub-optimal workflow. Model selection to address a problem statement has several drawbacks: the selected model may not have the appropriate level of complexity, the model may not address all parts of the problem statement without making less desirable assumptions, or the model may have more features and requirements than necessary. HMS components include data provisioning and simulation algorithms for water quantity and quality modeling. Workflows built using HMS components can in turn be used as components in larger workflows. For example, precipitation data provisioning components can download data from various data sources such as NLDAS, GLDAS, DAYMET, NCDC, PRISM, and WGEN. A simple workflow was developed as an HMS component to compare precipitation data from different sources. Comparison is performed using multiple rainfall statistics. |
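A hypothetical sketch of how a workflow might call one such RESTful data-provisioning component from a script (the endpoint URL and JSON fields are placeholders, not the actual HMS API):

    # Hypothetical client-side sketch; the endpoint URL and JSON fields are placeholders,
    # not the actual HMS API.
    import requests

    BASE_URL = "https://example.gov/hms/api/precipitation"    # placeholder endpoint

    def get_precip(source, start, end, lat, lon):
        payload = {
            "source": source,
            "dateTimeSpan": {"startDate": start, "endDate": end},
            "geometry": {"point": {"latitude": lat, "longitude": lon}},
        }
        response = requests.post(BASE_URL, json=payload, timeout=120)
        response.raise_for_status()
        return response.json()

    nldas = get_precip("NLDAS", "2010-01-01", "2010-12-31", 33.75, -84.39)
    daymet = get_precip("DAYMET", "2010-01-01", "2010-12-31", 33.75, -84.39)
    # A comparison workflow would compute rainfall statistics from both responses here.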
|
3:40 PM |
Nexus Tools Platform: facilitating the selection of suitable nexus tools Stephan Hülsmann, United Nations University - UNU FLORES 3:40 PM - 5:00 PM Addressing integrated resources management in a cross-sectoral nexus approach requires a holistic understanding of the interlinkages of environmental processes, while also taking into consideration global change and socioeconomic aspects. Exploring these interlinkages and advancing an integrated management approach requires integrated modelling tools. However, no single modelling tool is available or conceivable that covers all processes, interactions and drivers within the nexus of water, food, energy and climate. Instead, a vast number of models are available and in use addressing particular (sets of) processes and resources. To address nexus-oriented research questions or management issues, making use of available tools and modifying or coupling them as required should be more efficient than developing a tool from scratch. For this to be possible, a database that allows the interactive comparison of such tools would be helpful. Therefore, we developed an interactive Nexus Tools Platform (NTP) for inter-model comparison, implemented as a web-based database (https://data.flores.unu.edu/projects/ntp). Continually being improved and updated, NTP aims to provide detailed information on a subset (currently including 74 models, covering major aspects related to water, soil and waste management) of a larger compilation of available modelling tools. The platform offers interactive charts and advanced search and filter functions designed to differentiate models in terms of processes, input and output parameters, application areas (countries), temporal resolution, and programming language, amongst others. These functionalities provide strong support for selecting the most appropriate (set of) model(s) for specific needs. |
|
3:40 PM |
Optimizing water supply options for a region with urban-rural interactions Andre Q. Dozier, Colorado State University 3:40 PM - 5:20 PM Water scarcity threatens to reduce or eliminate agricultural production from semi-arid regions with rising urban populations and environmental regulations that inhibit infrastructural investment. Although this transition in water from agricultural use to municipal use may be the economically efficient outcome for those involved with water trades, lower income rural communities are typically disproportionately damaged when benefits from sale of land and water are directed outside of rural regions. However, when benefits of sold water remain local, as may be the case in the South Platte River Basin, selling more water will actually benefit local agricultural communities more than continued production. At the same time, water rights prices and treatment costs are so high that municipalities can save more money by conserving than by purchasing new supply. The municipal choice to conserve thus reduces costs for municipalities, but reduces financial benefits to local agricultural communities. A multiobjective framework is presented that exposes these tradeoffs inherent in water management decisions across sectors, assisting regional and state water planners in identifying policy and conservation targets. Out of the five supply options within the model (agricultural water, water storage reservoirs, efficient toilets, xeriscaping, and upgraded irrigation technology), adoption of xeriscaping most significantly reduces municipal cost and irrigation technology most significantly benefits agricultural communities. Of the institutional changes considered, reducing raw water requirements saves municipalities the most money, while reducing the amount of permanently fallowed cropland through deficit irrigation, permanent fallowing, or rotational fallowing provides the widest range of benefits across both sectors. |
|
3:40 PM |
Scaling vegetation dynamics: a metamodeling approach based on deep learning Werner Rammer, Institute of Silviculture, Department of Forest- and Soil Sciences , University of Natural Resources and Life Sciences (BOKU) 3:40 PM - 5:20 PM Terrestrial vegetation is of crucial importance for human well-being and provides a wide variety of ecosystem services to society. To tackle global issues such as climate change or biodiversity loss, managers increasingly demand tools that allow the prediction of vegetation dynamics at large spatial scales. While dynamic vegetation models with a faithful representation of demographic processes exist for local to landscape scale, addressing larger scales with the fine spatial grain required to answer management questions remains a challenge. We here introduce a new framework for Scaling Vegetation Dynamics (SVD) that at its core utilizes deep neural networks (DNNs). Deep Learning is an emerging branch of machine learning, currently revolutionizing computer vision, natural language processing and many other fields. In the context of SVD, a DNN learns vegetation dynamics from a high resolution process based vegetation model (PBM). Specifically, the DNN is trained to predict the probability of transitions between discrete vegetation states contingent on the current state, the residence time, environmental drivers (climate and soil conditions), and the spatial context (i.e., the state of neighboring cells). In addition, the density distributions of relevant ecosystem attributes (e.g., total ecosystem carbon or biodiversity) are derived from PBM output for each vegetation state, which allows assessing the impact of vegetation transitions on those attributes. In this contribution we introduce the conceptual approach of SVD and show results for an example application in the Austrian Alps. More generally, we discuss aspects of applying deep learning in the context of ecological modeling. |
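As a schematic illustration of the metamodeling step (an editor's sketch assuming TensorFlow/Keras as an example deep-learning library; the state counts, input drivers, and training data are random placeholders rather than PBM output, and the authors' network differs):

    # Editor's schematic sketch, assuming TensorFlow/Keras; states, drivers, and training
    # data are random placeholders rather than PBM output.
    import numpy as np
    import tensorflow as tf

    n_states, n_features = 20, 12                     # placeholder vegetation states / input drivers
    X = np.random.rand(1000, n_features).astype("float32")
    y = np.random.randint(0, n_states, size=1000)     # placeholder "next state" labels

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_states, activation="softmax"),   # transition probabilities
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)

    probs = model.predict(X[:1])                      # probability of moving to each state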
|
3:40 PM |
Sustainable Agricultural Intensification through Crop Water and Nutrient Management Optimization A. Pouyan Nejadhashemi, Michigan State University 3:40 PM - 5:20 PM To meet the needs of the Earth’s growing population, three key challenges need to be addressed in the 21st century: food, energy, and water security. The demands of the growing population require that each of these sectors increase their output. However, given the interconnected nature of these sectors, sustainability is required to ensure that the global demands for all three are still met while preserving the environment. In this study, optimization was used to determine when irrigation and fertilizer should be applied to maximize crop yield while minimizing environmental impacts at the farm level. To do this, a multi-objective optimization technique, the Non-dominated Sorting Genetic Algorithm-III (NSGA-III), and a crop model, the Decision Support System for Agrotechnology Transfer (DSSAT), were utilized to identify optimal solutions that represented the tradeoffs between crop yield and environmental impact. Identifying a set of possible solutions allows producers and decision makers to select the solution that best fits their situation. For this study, the technique was implemented and successfully identified irrigation schemes that reduced water use by 50%. Given this success, the next step is to develop a web-based decision support tool that allows for the widespread use of the technique developed in this study by both policy makers and producers, allowing them to perform optimizations with multiple conflicting objectives while taking into account a variety of soil, crop, and climate types. |
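A minimal sketch of the tradeoff-identification idea (an editor's illustration using simple Pareto filtering on randomly generated candidate evaluations; it does not reproduce NSGA-III or DSSAT):

    # Editor's illustration: keep only nondominated (Pareto-optimal) candidates when
    # maximizing yield and minimizing nitrate leaching; the evaluations are random placeholders.
    import numpy as np

    rng = np.random.default_rng(1)
    yields = rng.uniform(4.0, 12.0, size=200)         # t/ha, placeholder candidate evaluations
    leaching = rng.uniform(5.0, 60.0, size=200)       # kg N/ha, placeholder candidate evaluations

    def nondominated(y, l):
        front = []
        for i in range(len(y)):
            dominated = np.any((y >= y[i]) & (l <= l[i]) & ((y > y[i]) | (l < l[i])))
            if not dominated:
                front.append(i)
        return front

    front = nondominated(yields, leaching)
    print(f"{len(front)} of {len(yields)} candidates are Pareto-optimal")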
|
Wednesday, June 27th | ||
9:00 AM |
Building Containerized Environmental Models Using Continuous Integration with Jenkins and Kubernetes Kyle Traff, Colorado State University 9:00 AM - 10:20 AM Environmental models typically consume vast amounts of computing resources. To effectively serve a growing community of physical, social and natural scientists, these models must be able to scale dynamically and horizontally to meet the demand. Models also require a vast array of software libraries, runtimes, compilers, and configurations specific to a particular application. Maintaining arrays of physical servers, each configured for one specific application, is expensive and inefficient. With the advent of software containers, model developers can isolate an application and all of its software dependencies from the physical server. Kubernetes, a container orchestration tool built by Google, has made it possible to dynamically deploy these containers seamlessly across a cluster of machines. We introduce key concepts and tools for building distributed modeling systems with containers using Kubernetes, managed with a continuous integration pipeline built in Jenkins. We then build and deploy a suite of comprehensive flow analysis (CFA) models as microservices. Finally, we test the service responsiveness, throughput, and average execution time of various containerized configurations of CFA models against deployment on virtual and bare-metal machines. |
|
9:00 AM |
Quillon Harpham, HR Wallingford 9:00 AM - 10:20 AM Dengue fever occurs in 141 countries with 122,000 cases reported in Vietnam in 2016. The epidemiological situation there has been worsened by the failure of health systems to maintain adequate control of the species of mosquito that spread dengue. Several studies have emphasised the significant links between weather variability and infectious diseases, highlighting the potential for developing early warning systems for epidemics. The same methods could also be used to forecast outbreaks of zika, which has recently begun to be reported in Vietnam. This presentation describes the results of a study, supported by the UK Space Agency, resulting in a high-level method for integrating multiple stressors such as water availability, land-use and climate predictions in order to forecast future outbreaks of dengue and zika. Earth observation data can help countries understand the dynamics of these integrated stressors on the health and water sectors, especially in regions with poor or non-existent ground monitoring. However, the associated evidence base is only just emerging and applying this work using remote sensing data is expected to make a significant contribution. The resultant tools will be used to understand changing health risks at different scales under future climate change scenarios and will also include a water assessment module that will feature the additional benefit of improving water management in Vietnam’s transboundary river basins. This multidisciplinary application of open socio-environmental modelling also extends to on-the-ground practitioners tasked with acting upon the predictions in a way that will best mitigate the risks. |
|
9:00 AM |
HydroShare: A Platform for Collaborative Data and Model Sharing in Hydrology David Tarboton 9:00 AM - 10:20 AM This paper addresses the open collaborative data and model sharing opportunities offered by the HydroShare web based hydrologic information system operated by the Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI). HydroShare users share and publish data and models in a variety of flexible formats, in order to make this information available in a citable, shareable and discoverable format for the advancement of hydrologic science. HydroShare includes a repository for data and models, and tools (web apps) that can act on content in HydroShare and save results back into the repository, forming a flexible web based architecture for collaborative environmental modeling research. This presentation will focus on the key functionalities of HydroShare that support web based collaborative research that is open and enhances reproducibility and trust in research findings through sharing of the data, models and scripts used to generate results. The HydroShare Jupyter Notebook app provides flexible and documentable execution of Python or R code snippets for analysis and modeling. An analysis or modelling procedure documented in a Jupyter Notebook may be saved as part of a HydroShare resource along with the associated data, and shared with other users or groups. These users may then open the notebook to modify or add to the analysis or modelling procedure, and save results back to the same, or a new, resource. Passing information back and forth this way serves to support collaboration on common data in a shared modelling platform. The Jupyter platform is embedded in high performance and data intensive cyberinfrastructure so that code blocks may include preparation and execution of advanced and data intensive models on the host infrastructure. We will discuss how these developments can be used to support collaborative research, where being web based is of value as collaborators can all have access to the same functionality regardless of their computer or location. |
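A hedged sketch of how such a round trip might look from a script, assuming the hs_restclient Python client for the HydroShare REST API (the credentials, titles, resource type, and file names are placeholders):

    # Hedged sketch assuming the hs_restclient client; username, password, file names,
    # and resource type are placeholders.
    from hs_restclient import HydroShare, HydroShareAuthBasic

    auth = HydroShareAuthBasic(username="your_username", password="your_password")
    hs = HydroShare(auth=auth)

    resource_id = hs.createResource(
        "CompositeResource",
        "Streamflow analysis notebook",
        abstract="Jupyter notebook and outputs from a shared modelling workflow",
        keywords=["Jupyter", "collaboration"],
    )
    hs.addResourceFile(resource_id, "analysis.ipynb")          # push the notebook and results
    hs.getResource(resource_id, destination=".", unzip=True)   # a collaborator pulls it back down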
|
9:00 AM |
Sam Roy 9:00 AM - 10:20 AM Dam removal is a cornerstone of environmental restoration practice in the United States. One positive outcome of dam removal is restored access to historic habitat for sea-run fish, providing a crucial gain in ecosystem resilience. But dams also provide stakeholders with valuable ecosystem services, such as municipal water storage, recreational use of lakes and rivers, property values, hydroelectricity generation, landscape nutrient and sediment flux, cultural attachments to dams, and many other river-based ecosystem services. Uncertain socio-ecological and economic outcomes can arise without carefully considering the basin scale trade-offs of dam removal. Using a combined modeling approach at watershed scales, we quantify how different dam decisions, such as removal, infrastructural improvements, management changes, or repairs, can impact the productivity of riverine ecosystem services. We identify decision scenarios that provide efficient productivity across multiple ecosystem services using a multi-objective genetic algorithm (MOGA). Production possibility frontiers (PPF) are then used to evaluate trade-offs between ecosystem services across multiple different decision scenarios. Our results suggest that for many rivers, there is potential to dramatically increase productivity of ecosystem services that benefit from open rivers with a minimal impact on dam-related services. Further benefits are made possible for all ecosystem services by considering decision alternatives related to dam operations and physical modifications. Our method is helpful for identifying efficient decisions, but a deep and mutual understanding of stakeholder preferences is required to find a true solution. We outline how to interpret these preferences in our framework based on participatory methods used in stakeholder workshops. |
|
9:00 AM |
Daniel Ames, Brigham Young University Law School 9:00 AM - 10:40 AM In view of the ubiquitous mobile-app concept that has taken hold over the past decade, whereby distinct, single purpose, modular applications are developed and deployed in a shared user interface (i.e. the phone in your pocket), we have created open source cyberinfrastructure that mimics this paradigm for developing and deploying environmental web applications using open source tools and cloud computing services. This cyberinfrastructure integrates HydroShare for cloud-based data storage and app cataloging, together with Tethys Platform for Python/Django based app development. HydroShare is an open source web-based data management system for climate and water data that includes a web services application programming interface (API) to allow third party programmers to access and use its data resources. We have created a metadata management structure within HydroShare for cataloging, discovering, and sharing web apps. Tethys Platform is an open source software package based on the Django framework, Python programming language, GeoServer, PostgreSQL, OpenLayers and other open source technologies. The Tethys software development kit allows users to create web apps that are presented in a common portal for visualizing, analyzing and modelling environmental data. We will introduce this new cyberinfrastructure through a combination of architecture design and demonstration, and will provide attendees the essential concepts for building their own web apps using these tools. |
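A schematic sketch of what a Tethys app class looks like, based on the Tethys Platform SDK as the editor understands it (the app name, package, icon, and controller path are placeholders; consult the Tethys documentation for the authoritative structure):

    # Editor's schematic sketch of a Tethys app class; names, paths, and colors are placeholders.
    from tethys_sdk.base import TethysAppBase, url_map_maker

    class DataViewer(TethysAppBase):
        name = 'Data Viewer'
        index = 'data_viewer:home'
        icon = 'data_viewer/images/icon.gif'
        package = 'data_viewer'
        root_url = 'data-viewer'
        color = '#3c8dbc'

        def url_maps(self):
            UrlMap = url_map_maker(self.root_url)
            return (UrlMap(name='home',
                           url='data-viewer',
                           controller='data_viewer.controllers.home'),)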
|
9:00 AM |
Jack R. Carlson, Colorado State University - Fort Collins 9:00 AM - 10:20 AM During the past two decades, the Object Modeling System (OMS) framework evolved from a tool for building, testing, and validating environmental models in a consistent, non-invasive manner as an assembly of science components to include the Cloud Services Integration Platform (CSIP) for their deployment and integration as web services with business systems of public and private organizations, as well as university research programs. Currently OMS/CSIP repositories contain 292 web services organized in 31 service layers for (1) hydrology and water resources management, (2) erosion and sediment transport, (3) conservation resource management, (4) data access, retrieval, and management, (5) geospatial and statistical analysis, and (6) research support. These services and supporting data stores have been deployed and integrated with research, pilot, and production systems and applications of several organizations. We currently provide tier 1, 2, or 3 support to about 2 million service requests annually through the OMS/CSIP lifecycle: development, testing, validation, release, deployment, and production. Working with many organizations having unique requirements presents several challenges to meeting desired levels of customer satisfaction, forcing an emphasis on process improvement and operational efficiency. Using our experience supporting Field to Market – The Alliance for Sustainable Agriculture, the USDA Conservation Delivery Streamlining Initiative (CDSI), and other user communities, we analyze and describe steps taken to meet customer expectations, including release management, continuous integration, capacity management and hosting, access control, privacy protection, system/business activity monitoring, archiving, data stewardship, and documentation. |
|
9:00 AM |
Nathan Lighthart, Colorado State University 9:00 AM - 10:20 AM Distributing models to various users can be difficult and error prone, and problems with distribution may reflect poorly on the program. Model distribution typically involves users downloading and installing the model following setup instructions. However, if this step is not handled correctly, the model cannot be used as intended. Deploying a model through a web interface allows the user to focus on running the model rather than ensuring the model is set up correctly. The eRAMS/CSIP platform is designed to provide visual tools for models to be parameterized and connected to input data (eRAMS), and run using a remote web service (CSIP) as Model as a Service (MaaS). By using MaaS, the model and the user’s data are accessible from any device. Further, models can take a long time to run, especially when calibrating. By executing the model remotely in asynchronous mode, the user can shut down their local machine without terminating the model run. Integration of the Agricultural Ecosystems Services (AgES) watershed model into the second revision of the eRAMS/CSIP platform will be described and demonstrated. Thus, the AgES watershed model is publicly available as a CSIP MaaS, which will be linked with other applications in eRAMS, including watershed delineation into interconnected polygons or hydrological response units (HRUs) and automated generation of crop rotations and tillage operations in each HRU using LAMPS (Landuse and Agricultural Management Practices web-Service). |
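A hypothetical sketch of the asynchronous Model-as-a-Service interaction described above (the endpoint URLs, JSON fields, and status values are placeholders, not the actual CSIP API):

    # Hypothetical sketch of the asynchronous pattern; URLs, JSON fields, and status
    # values are placeholders, not the actual CSIP API.
    import time
    import requests

    SERVICE = "https://example.org/csip-ages/run"      # placeholder model service endpoint

    job = requests.post(SERVICE, timeout=60,
                        json={"parameters": {"watershed": "demo", "years": 10}}).json()
    job_id = job["job_id"]                             # placeholder response field

    while True:
        status = requests.get(f"{SERVICE}/{job_id}/status", timeout=60).json()
        if status["state"] in ("FINISHED", "FAILED"):  # placeholder states
            break
        time.sleep(30)                                 # the user can disconnect in the meantime

    results = requests.get(f"{SERVICE}/{job_id}/results", timeout=60).json()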
|
9:00 AM |
The Evolution of an Open, Interdisciplinary Earth System Modeling Framework Cecelia DeLuca 9:00 AM - 10:20 AM The Earth System Modeling Framework (ESMF) is open source software for building and coupling model components. ESMF was created by a consortium of U.S. federal agencies to support the transfer of knowledge among modeling centers and universities. It has grown into an established national resource, used in various forms by modelers at NASA centers, the National Weather Service, the National Center for Atmospheric Research, the Navy, and thousands of smaller groups and individuals. In the decade since it began, the challenges of community development and deployment have combined technical, scientific, and social aspects. In this talk, we examine the progression of these challenges and look to the future. Current ESMF focus areas are examined, including the development of an interdisciplinary, unified forecast system for the National Weather Service that includes multiple "apps" for different types of prediction; automated resource mapping to address the growing complexity of computing architectures and coupled models; and the "post-interoperability" challenge of distributed component code management. |
|
10:40 AM |
Mohamed M. Morsy, University of Virginia 10:40 AM - 12:00 PM Coastal areas face significant challenges due to climate change. One of the primary challenges is the increasing risk of flooding that can cause severe damage and threaten lives. As such, the ability to accurately forecast flooding events and disseminate alerts is increasingly important. The National Water Model (NWM) has made large strides in providing flood forecasting information on a large scale. However, the coarse resolution of the NWM may not be sufficient for low relief coastal terrains. Rather, 2D hydrodynamic models are often more suitable for flood forecasting in coastal areas where low relief terrain is common. However, the computational expense of these models can pose a barrier to their implementation. This work focuses on the design of a cloud-based, real-time modeling system for a 2D hydrodynamic model coupled with the NWM to support decision makers in assessing flood risk in coastal areas. A prototype has been created using Google Cloud Platform (GCP) including cloud-based execution for the 2D hydrodynamic model with high spatial resolution input data, utilization of GPUs for model execution speed-up, a relational database for storing the model output, and a web front-end for dissemination of results and model initiation. The system is designed to run automatically if an extreme weather event is forecasted and produce near real-time results using boundary conditions automatically obtained and prepared from the NWM. |
|
10:40 AM |
A General Approach for Enabling Cloud-based Hydrologic Modeling using Jupyter Notebooks Anthony M. Castronova, Consortium of Universities for the Advancement of Hydrologic Science, Inc 10:40 AM - 12:00 PM Continued investment and development of cyberinfrastructure (CI) for water science research is transforming the way future scientists approach large collaborative studies. Among the many challenges that we as a community need to address are integrating existing CI to support reproducible science, enabling open collaboration across traditional domain and institutional boundaries, and extending the lifecycle of data beyond the scope of a single project. One emerging solution for addressing these challenges is HydroShare JupyterHub, an open-source, cloud-based platform that combines the data archival and discovery features of HydroShare with the expressive, metadata-rich, and self-descriptive nature of Jupyter notebooks. This approach offers researchers a mechanism for designing, executing, and disseminating toolchains with supporting data and documentation. The goals of this work are to establish a free and open source platform for domain scientists to (1) conduct data intensive and computationally intensive collaborative research, (2) utilize high performance libraries, models, and routines within a pre-configured cloud environment, and (3) enable dissemination of research products. This presentation will discuss our approach for hydrologic model simulation, sensitivity analysis, and optimization applications in this platform by establishing a generic CI pattern that can be adopted to support research, classroom, and workshop activities. |
|
10:40 AM |
A service integration platform for geo-simulation in the distributed network environment Chaoran Shen, Nanjing Normal University 10:40 AM - 12:00 PM Geographic modeling and simulation is an important method for resolving realistic geographic problems, and can be used to reproduce the past, predict the future, and explore geographic laws. When dealing with comprehensive geographical problems, a single model cannot meet the requirements of complex simulation, and thus model integration with existing model resources is required. Although existing geo-simulation platforms have gone through several stages of development, they still face many problems, such as how to prepare data for models in the network environment, how to establish the logical relationships between different services, and how to control the integrated simulation in a convenient and collaborative way. This paper proposes a geo-model service integration platform for the distributed network environment. Model resources, model-related data resources and data processing methods can be encapsulated as standardized services, which can be managed by servers. When faced with a realistic geographic problem, the comprehensive problem can be solved collaboratively, through conceptual modelling and logical modelling, by the simulation of integrated geo-model services. Finally, an example of extracting a stream network with TauDEM models is presented to test the practicability of the proposed platform; the results show that the service integration platform can conduct geographic simulation and solve geo-problems. |
|
10:40 AM |
Holm Kipka, Colorado State University 10:40 AM - 12:00 PM Delineating a watershed area into discrete areas (e.g. Hydrological Response Units HRUs) with parameter attribute tables is essential to generate input parameter sets for distributed hydrological models. This presentation will introduce a generic methodology of such a delineation process and services workflow for the Agricultural Ecosystems Services (AgES) watershed model. Here, Digital Elevation Model raster data are processed for the topology of flow paths, then combined with raster layer of land use, soils, and hydrogeology to generate HRU patterns for a watershed area. Based on the pattern of HRUs, the web service analyzes a topological routing scheme that allows multiple-flow directions and interactions between neighboring HRUs. The resulting HRU information is used to automatically generate input files for the AgES distributed watershed model. The tool is open source, available to the scientific community, and implemented using Catena, an emerging platform based on the eRAMS and CSIP frameworks providing scalable geospatial analyses, collaboration, and model service capabilities. Seamless integration of (1) generating spatial model parameters, (2) connecting input data, (3) executing the model, and (4) model result analyses will be presented with a strong focus on the workflow aspect of Catena. The HRU delineation tool can be adapted to other models such as the Soil and Water Assessment Tool (SWAT+). |
|
10:40 AM |
Integrating socio-environmental models: vegetation, agriculture, and landform dynamics Miguel Acevedo, University of North Texas 10:40 AM - 12:00 PM How did landscapes evolve as agriculture emerged thousands of years ago? How do we ensure sustainable food production and still maintain environmental quality? Integrated socio-environmental models help provide answers to these two seemingly distant, yet related questions. For practical purposes, in this paper we use the terms socio-ecological and socio-environmental systems, with the acronym SES, as synonyms. The Mediterranean Landscape Dynamics (MedLand) project aims to develop experimental SES models made possible by recent advances in computation while exercising interdisciplinary collaboration. In this paper, we exemplify one aspect of this integration by discussing the development of a vegetation model, which at the outset provides future links to agent-based models of societal dynamics and process-based models of landform evolution. While designing a vegetation model specific to the needs of the MedLand SES workbench, we preserve those aspects of vegetation dynamics that yield a generic model applicable to other systems. We model vegetation using an individual-based (or gap-model) approach with detailed biological interaction of plants with fire. For this purpose, we use components of existing models of Mediterranean vegetation dynamics. As part of the integration challenge, we discuss spatial and temporal scales, resolution, and future prospects for integration analysis based on the sensitivity of the integrated model to coupling parameters. |
|
10:40 AM |
Jack R. Carlson, Colorado State University - Fort Collins 10:40 AM - 12:00 PM The Stewardship Tool for Environmental Performance (STEP) is an expert system for field conservationists to evaluate water quality benefits of management techniques and conservation practices applied to agricultural land. The tool computes nutrient leaching and runoff potentials, sediment runoff potentials, pesticide leaching and runoff potentials, as well as pesticide hazard ratings for farm fields based on local conditions. From this STEP computes minimum threshold levels reflecting the level of treatment needed to mitigate the loss potentials and hazards. The tool then provides a process for applying mitigating technique and practice scores to meet or exceed thresholds. The nutrient component of STEP results from analysis of the millions of APEX model simulations used in the USDA Conservation Effects Assessment Program (CEAP) completed for the major river basins of the United States since 2002. CEAP estimated the environmental benefits of conservation practices at the river basin level and in 2011 incorporated the knowledge into STEP. The pesticide component of STEP results from more than two decades using the stand-alone Pesticide Screening Tool application, which has roots in the CREAMS and GLEAMS models. As part of the effort to integrate resource assessment at the farm field level, we describe a suite of 22 microservices supporting the STEP workflow. We make these services available through the OMS/CSIP continuous integration process. |
|
10:40 AM |
Luis Garnica Chavira, Git Gud Consulting SAS 10:40 AM - 12:00 PM The wide variety in descriptions, implementations, and accessibility of scientific models poses a huge challenge for model interoperability. Model interoperability is key in the automation of tasks including model integration, seamless access to distributed models, and data reuse and repurposing. Current approaches for model interoperability include the creation of generic standards and vocabularies to describe models, their inputs and outputs. These domain-agnostic standards often do not provide the fine-grained detail required to describe a specific domain or task, and extending such standards requires considerable effort and time that is diverted from producing scientific breakthroughs and results. This paper presents a semi-structured, knowledge-based framework implemented with a service-driven architecture: the Sustainable Water through Integrated Modelling Framework (SWIM). SWIM is part of an ongoing effort to expose water sustainability models on the Web with the goal of enabling stakeholder engagement and participatory modelling. SWIM is a science-driven platform, leveraging technological advances in service-oriented architectures (SOA), schemaless database managers (NoSQL) and widely used Web-based frontend frameworks. The SWIM semi-structured knowledge model is flexible enough to adapt on-the-go as the underlying water sustainability models grow in complexity. SWIM fosters the sharing and reuse of data and models generated in the system by providing descriptions of the models, inputs, and outputs of each run using relevant metadata mapped to widely-used standards with JSON-LD, a JSON extension for linked data. |
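An illustrative sketch of JSON-LD style run metadata of the kind described (the context terms and field names are placeholders, not SWIM's actual schema):

    # Illustrative run-metadata record in a JSON-LD style; context terms and field names
    # are placeholders, not SWIM's actual schema.
    import json

    run_metadata = {
        "@context": {
            "schema": "https://schema.org/",
            "name": "schema:name",
            "dateCreated": "schema:dateCreated",
            "variableMeasured": "schema:variableMeasured",
        },
        "@type": "schema:Dataset",
        "name": "Water sustainability scenario run 42",
        "dateCreated": "2018-06-27",
        "variableMeasured": ["aquifer storage", "reservoir releases"],
        "inputs": {"recharge_scenario": "baseline", "planning_horizon_years": 50},
    }

    print(json.dumps(run_metadata, indent=2))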
|
10:40 AM |
Stephen Knox, The University of Manchester 10:40 AM - 12:00 PM Developing, sharing and using models to address socio-environmental problems requires both a common vocabulary and agreement on the scope of the modelled domain. One approach to this is the use of a single, shared conceptual model and a centralised data store where the inputs and outputs of each submodel are stored. For network-based models, the Pynsim Python library facilitates this structure by allowing modellers to build a shared network structure using object-oriented design, and allowing submodels to be ‘plugged in’ to a Pynsim simulation. Building a shared conceptual model can be difficult if collaborators work remotely or some collaborators are inexperienced in software architecture design. By combining Pynsim with a web-based collaborative tool, some of these difficulties can be addressed. The Hydra Platform is a web-based data management system for network structures and data. Using templates, a shared structure can be developed where all the nodes, links, institutions (groupings of nodes and links) and their associated attributes can be defined. Using this template, a network topology can then be defined and its attributes parameterised. This network acts as the common conceptualisation and as the storage facility for input and output data. Using the web interface, the template and network can be managed and shared visually amongst users, giving all users a visual reference to the shared conceptualisation. Using web requests, a client can extract the network from Hydra Platform and create a Pynsim network, thereby creating an input for the shared simulation. Once complete, results are pushed back to Hydra Platform for analysis, either through the web UI’s built-in analysis tools or for download by the collaborators. |
|
2:00 PM |
A Study on Data Processing Services for Operation of Geo-Analysis Models in the Open Web Environment Jin Wang 2:00 PM - 3:20 PM With the development of network technology, the study of integrated modeling frameworks based on web services is becoming a key topic to contribute to solving complex geographic problems. To date, large numbers of geo-analysis models and massive data resources are available in the open web environment. Accessing, acquiring or invoking individual resources transparently is relatively straightforward; however, combining these models and data resources together for comprehensive simulations still remains challenging due to their heterogeneity and diversity. Data resources are the driving force of model execution, and they can serve as an intermediate linkage medium for model integration. However, in most cases, the data resources cannot be used directly to drive or link models. To enable the convenient coupling of models and data resources through the web, thus reducing the difficulty of preparing data and avoiding repetitive data processing work, data processing services that can prepare and process data are urgently needed in the web environment. Based on the proposed Universal Data eXchange (UDX) model, three types of data processing methods that can realize mapping, refactoring, and visualization were designed in this research. These methods can be published as data processing services to facilitate the operation of geo-analysis models through the web environment. The applicability of the proposed data processing services is examined using two cases: data preparation using processing services for the WATLAC model is designed in the first case, and the application of the processing service in model integration through the web is designed in the second case. The results demonstrate that the proposed processing services can bridge the gap between geo-analysis models and data resources hosted on networks. |
|
2:00 PM |
Geographic Process Modeling Based on Geographic Ontology Yuwei Cao, Nanjing Normal University 2:00 PM - 3:20 PM In geo-information science, which has traditionally focused on representations of spatial and temporal information, the representation of geographic processes like soil erosion is becoming more important. Exploring an appropriate method to express a geo-process is significant in revealing its dynamic evolution and underlying mechanisms. This research proposes a process-centric ontology model. It describes the geographic environment through three aspects: geographic scene, geographic process and geographic feature. Geographic scene is a unified expression of the environment that considers the integrity of geo-processes as well as the spatial-temporal pattern of geo-features. Geographic process defines the existing actions of geo-features, and represents spatial, temporal and semantic changes. Geographic feature is the smallest unit of a geographic object, which contains basic geographic information and the affiliation between geo-processes and geo-features. The above three aspects are represented through the proposal of a framework and the construction of ten sub-ontologies. These include Feature Ontology, Scene Ontology, Process Ontology, Space Ontology, Time Ontology, Spatial Relation Ontology, Time Relation Ontology, Representation Ontology, Substance Ontology and Operator Ontology. An instance of the soil erosion process is then selected to demonstrate the practicability of this framework. The entire process is separated into three sub-processes (soil detachment, soil transport and soil deposition), and each sub-process is described by when and where the process happened, identifying which features were present and how they reacted (interaction between features, processes, and scenes), and what kind of changes were present in the geo-scene. Furthermore, different relationships between features, scenes and processes are defined to explain how and why soil erosion occurred. This proposed approach can reveal the underlying mechanism of geo-scenes, explore the occurrence and causes of geo-processes, and support the complex representation of geo-features. |
|
2:00 PM |
Baojia Zhang, University of Washington Tacoma 2:00 PM - 3:20 PM Recently, serverless computing platforms have emerged that provide automatic web service hosting in the cloud. These platforms are promoted for their ability to host “micro” services to end users while seamlessly integrating key features including 24/7 high availability, fault tolerance, and automatic scaling of resources to meet user demand. Serverless computing environments abstract the majority of infrastructure management tasks, including VM/container creation and load balancer configuration. A key benefit of serverless computing is free-tier access to cloud computing resources: many platforms provide free access for up to 1,000,000 service requests per month with 1 GB of memory for 400,000 seconds per month. Additionally, serverless platforms support programming languages common among model developers, including Java, Python, C#, and JavaScript. We present results from a proof-of-concept deployment of the Precipitation-Runoff Modeling System (PRMS), a deterministic, distributed-parameter model developed to evaluate the impact of various combinations of precipitation, climate, and land use on stream flow, sediment yields, and general basin hydrology (Leavesley et al., 1983). We deployed a web-services based implementation of PRMS, implemented using the Cloud Services Integration Platform (CSIP) and the Object Modeling System (OMS) 3.0 component-based modeling framework, to the Amazon AWS Lambda serverless computing platform. PRMS consists of approximately 11,000 lines of code and easily fits within the 256 MB maximum code size constraint of AWS Lambda. We compared our serverless deployment to a traditional Amazon EC2 VM-based cloud deployment. We contrast average model execution time, service throughput (requests/minute), and the cloud hosting costs of PRMS on both platforms. |
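A minimal sketch of the serverless function shape using AWS Lambda's Python handler convention (the payload fields and the run_model stub are placeholders; the actual deployment wraps the Java-based CSIP/OMS PRMS service):

    # Editor's sketch of the serverless function shape (AWS Lambda's Python handler
    # convention); payload fields and the run_model stub are placeholders.
    import json

    def run_model(parameters):
        # placeholder for invoking the packaged model code
        return {"mean_daily_flow_cfs": 123.4, "parameters_used": parameters}

    def lambda_handler(event, context):
        body = json.loads(event.get("body", "{}"))     # e.g. when invoked through API Gateway
        results = run_model(body.get("parameters", {}))
        return {"statusCode": 200, "body": json.dumps(results)}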
|
2:00 PM |
Microservices for the Stream Visual Assessment Protocol (SVAP) Jack R. Carlson, Colorado State University - Fort Collins 2:00 PM - 3:20 PM The Stream Visual Assessment Protocol (SVAP) provides an initial assessment of the overall condition of wade-able streams, their riparian zones, and instream habitats. Field conservationists use the tool when providing technical assistance to land owners to improve stream conditions, sustainable use, and value of their property. SVAP does not require extensive training in biology, geomorphology, or hydrology, and represents a first step towards more detailed analysis and recommendations as needed. The protocol was developed in 1999 by the USDA Natural Resources Conservation Service (NRCS) Aquatic Assessment Workgroup, following two years of field study and validation involving 182 stream reaches in 9 states across the country. Following a decade of use, SVAP was updated in 2009 to increase sensitivity to resource conditions at the state and regional levels. To this point, SVAP has been applied as a mostly manual process, completing individual worksheets guided by a field manual, persisted as spreadsheets, PDF files, or other documents, in a file system, or more recently a document management system. However, completing the worksheet does not take advantage of on-line data sources nor meet priorities for integrating assessment of resource concerns on farms and ranches. We describe a suite of 14 SVAP microservices and associated data tables supporting web application data entry and editing, managing reference streams, and computing assessment scores. We make these services available through our OMS/CSIP continuous integration process. |
|
2:00 PM |
Young Don Choi, University of Virginia 2:00 PM - 3:20 PM The Structure for Unifying Multiple Modeling Alternatives (SUMMA) is a hydrologic modeling framework that allows hydrologists to systematically test alternative model conceptualizations. The objective of this project is to create a Python library for wrapping the SUMMA modeling framework called pySUMMA. Using this library, hydrologists can create Python scripts that document the alternative model conceptualizations tested within different experiments. To this end, pySUMMA provides an object-oriented means for updating SUMMA model configurations, executing SUMMA model runs, and visualizing SUMMA model outputs. This work is part of the HydroShare web-based hydrologic information system operated by the Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI) that seeks to make hydrologic data and models discoverable and shareable online. Creating pySUMMA is a first step toward the longer-term goal of creating an interactive SUMMA-based modeling system by combining HydroShare.org, JupyterHub, and High Performance Computing (HPC) resources. In the current version of HydroShare, different data and model resources can be uploaded, shared, and published. This current development will result in a tighter integration between the SUMMA modeling process and HydroShare.org with the goal of making hydrologic models more open, reusable, and reproducible. Ultimately, SUMMA serves as a use case for modeling in HydroShare that advances a general approach for leveraging JupyterHub and HPC that can be repeated for other modeling systems. |
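A hypothetical sketch of the kind of object-oriented wrapper described (the class, method, option, and file names are illustrative only and do not reflect the actual pySUMMA API):

    # Hypothetical wrapper sketch; class, method, option, and file names are illustrative
    # and do not reflect the actual pySUMMA API.
    import subprocess

    class ModelRun:
        def __init__(self, executable, file_manager):
            self.executable = executable           # path to the compiled model
            self.file_manager = file_manager       # run-configuration file
            self.decisions = {}

        def set_decision(self, name, value):
            self.decisions[name] = value           # record an alternative model conceptualization

        def run(self):
            cmd = [self.executable, "-m", self.file_manager]
            return subprocess.run(cmd, capture_output=True, text=True)

    run = ModelRun("./summa.exe", "fileManager.txt")
    run.set_decision("stomatal_resistance", "BallBerry")   # example decision documented in the script
    # result = run.run()                                   # executes the model (binary required)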
|
2:00 PM |
Study on a Service Container for Sharing and Reusing Geo-analysis Models in the Web Environment Fengyuan Zhang, Nanjing Normal University 2:00 PM - 3:20 PM Geo-analysis models are the abstraction and expression of geographic phenomena and processes. To date, many geo-analysis models have been created in different domains, and collaborative modeling and simulation using models in the open Web environment are becoming popular for geographic research. Service-oriented sharing and reusing of geo-analysis models through the Web are an important foundation of this work. However, model service-related research faces several challenges, especially an inability to reuse heterogeneous resources of geo-simulation. By studying the generation, management, and publication of model services, this article aims to design and develop a service container for geo-analysis models that acts as a service loader and manager to publish these models as services to users. The service container can accomplish such functions as standardizing model preparation, managing model services, and invoking interactive model services, thus bridging the gap between geo-simulation resource providers and service users. This article uses the Soil and Water Assessment Tool model and the Unstructured Grid Finite Volume Community Ocean Model as case studies to show that the proposed service container can enable the convenient sharing of model resources, and can further contribute to collaborative modeling and simulation across a network. |
|
3:40 PM |
Lauren Hay, USGS 3:40 PM - 5:00 PM The traditional approach to hydrologic model calibration and evaluation—comparing observed and simulated streamflow—is not sufficient. Intermediate process variables computed by the model could be characterized by parameter values that do not necessarily replicate those hydrological processes in the physical system. To better replicate these intermediate processes, in addition to streamflow, intermediate process variables can be examined when there is an associated “observed” variable that can be used as a baseline dataset. This study presents a CONUS-scale parameter estimation technique for the USGS National Hydrologic Model application of the Precipitation-Runoff Modeling System (NHM-PRMS) using five ‘baseline’ data sets (runoff, actual evapotranspiration, recharge, soil moisture, and snow covered area). These baseline datasets, with error bounds, were derived from multiple sources for each of the NHM-PRMS’s 109,951 hydrologic response units (HRUs) on time scales from annual to daily. Sensitivity analysis was used to identify the (1) calibration parameters and (2) relative sensitivity of each baseline dataset to the calibration parameters, for each HRU. A multiple-objective, step-wise, automated calibration procedure was used to identify the ‘optimal’ set of parameter values for each HRU. Using a variety of baseline data sets for calibration of the intermediate process variables alleviates the equifinality problem and “getting the right answer for the wrong reason.” Through a community effort we should strive to improve our understanding of this baseline information, making it available to the modeling communities to help calibrate and evaluate hydrologic models using more than streamflow. |
|
3:40 PM |
Comparative assessment of carbon footprint of four dairy farms in Australia using DairyGHG Model Veerasamy Sejian Dr, ICAR-National Institute of Animal Nutrition and Physiology, Bangalore, India 3:40 PM - 5:00 PM The DairyGHG model is a cost-effective and efficient method of estimating greenhouse gas (GHG) emissions from dairy farms and analyzing how management strategies affect these emissions. Therefore, the DairyGHG model was used in this study to predict the GHG emissions and assess the carbon footprints of four different dairy farms in Australia. The study was conducted in four different dairy farms distributed in different localities of Queensland, Australia. The details of the farms are: Farm 1 (220 cows; Jersey), Farm 2 (460 cows; Holstein Friesian), Farm 3 (850 cows; Holstein Friesian) and Farm 4 (434 cows; Holstein Friesian). In all four farms, the cows were fed corn silage and grain, and the animals had access to grazing. The animal emission contribution to carbon footprints in Farm 1, Farm 2, Farm 3 and Farm 4 was 54.2%, 60.0%, 59.6% and 38.6%, respectively. Likewise, the manure emission contribution to carbon footprints in Farm 1, Farm 2, Farm 3 and Farm 4 was 30.6%, 29.0%, 29.0% and 58.3%, respectively. Per kg of energy-corrected milk, the amount of GHG produced in Farm 1, Farm 2, Farm 3 and Farm 4 was 0.39 kg CO2e, 0.64 kg CO2e, 0.54 kg CO2e and 1.35 kg CO2e, respectively. On a comparative basis, Farm 4 contributed a substantially higher quantity of GHG emissions, while the least contribution came from Farm 1. Thus, it can be concluded from the study that the Jersey breed contributes comparatively less dairy-associated GHG emission than the Holstein Friesian breed. |
|
3:40 PM |
CONUS-scale Stream Temperature Modeling utilizing the USGS National Hydrologic Model Steven Markstrom, USGS 3:40 PM - 5:00 PM Stream temperature is fundamentally important in the structure and function of freshwater riverine ecosystems. The interactions of flora and fauna with chemical constituents, dissolved oxygen and other water quality factors are influenced by stream temperature. Computer models can be used to simulate stream temperature at stream segment resolution (e.g., a network with stream segments between 1 and 100 kilometers long), which in turn can facilitate decision making by ecologists and resource managers. A daily mean stream temperature modeling application, based on the hydrologic simulations of the U.S. Geological Survey’s National Hydrologic Model, has been developed. Preliminary results from this application are presented. |
|
3:40 PM |
Creation of a data model to retrieve constrained watershed boundaries. Scott Haag, Drexel University 3:40 PM - 5:00 PM When using flow direction grids (FDGs) to describe impacts to streams and watershed pour points, two spatial scales of interest are commonly applied: the entire watershed and the local stream catchment. In this presentation we describe an alternative approach referred to as the constrained watershed boundary. Specifically, we define a constrained watershed as a polygon containing all the flow direction grid cells with a surface flow distance less than a prescribed threshold. The proposed algorithm builds upon the HSM algorithm (Haag and Shokoufandeh, 2017) and augments the data structure with a flow distance grid calculated directly from the original FDG. The new algorithm allows the rapid retrieval and visualization of constrained watershed boundaries based on user-defined distance threshold(s). The merged algorithm is a variant of the HSM algorithm and therefore retrieves watershed boundaries more efficiently than grid-searching techniques. Empirical tests for the Delaware River watershed retrieval problem indicate a reduction from ~35 million read operations to ~45 thousand using the HSM approach (Haag and Shokoufandeh, 2017). This algorithm was implemented within a RESTful API framework for the continental United States using the 30 m FDG from NHDPlusV2 (but it can be implemented using any input D8 flow direction grid, e.g., HydroSHEDS). Results will be demonstrated during the presentation using a standard web browser. |
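A minimal sketch of the constrained-watershed idea (an editor's illustration, not the HSM-based algorithm): walk upstream through a D8 flow direction grid from a pour point, accumulate flow distance, and keep only cells within the distance threshold. The tiny grid and the power-of-two D8 encoding defined in the dictionary are placeholders:

    # Editor's illustration (not the HSM-based algorithm): breadth-first search upstream
    # through a D8 flow direction grid, accumulating flow distance and keeping only cells
    # within the threshold.
    from collections import deque
    import numpy as np

    # D8 code -> (row offset, col offset) of the cell it drains TO (illustrative encoding)
    D8 = {1: (0, 1), 2: (-1, 1), 4: (-1, 0), 8: (-1, -1),
          16: (0, -1), 32: (1, -1), 64: (1, 0), 128: (1, 1)}

    def constrained_watershed(fdr, pour_point, max_distance, cellsize=30.0):
        rows, cols = fdr.shape
        dist = np.full(fdr.shape, np.inf)
        dist[pour_point] = 0.0
        queue = deque([pour_point])
        while queue:
            r, c = queue.popleft()
            for code, (dr, dc) in D8.items():
                rr, cc = r - dr, c - dc                         # candidate upstream neighbour
                if 0 <= rr < rows and 0 <= cc < cols and fdr[rr, cc] == code:
                    d = dist[r, c] + cellsize * (2 ** 0.5 if dr and dc else 1.0)
                    if d <= max_distance and d < dist[rr, cc]:
                        dist[rr, cc] = d
                        queue.append((rr, cc))
        return dist < np.inf                                    # constrained watershed mask

    fdr = np.array([[128, 64, 32],
                    [  1,  1, 16],
                    [  2,  4,  8]])                             # toy grid draining to the centre cell
    mask = constrained_watershed(fdr, (1, 1), max_distance=35.0)
    print(mask.sum())                                           # pour point + four cardinal neighbours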
|
3:40 PM |
Cyberinfrastructure for Enhancing Interdisciplinary Engagement in Coastal Risk Management Research Ayse Karanci, North Carolina State University / Moffatt and Nichol 3:40 PM - 5:00 PM Today, tackling critical questions often requires the collaboration of researchers from different disciplines or institutions. Coastal hazards research is necessarily interdisciplinary and multi-methodological and often requires a team of researchers. This paper introduces an interdisciplinary coastal hazard risk model which combines high-resolution geospatial data, storm impact forecasts, and an agent-based model, and describes the model’s application in a data science cyberinfrastructure. The implementation of the coastal hazard risk model in the cyberinfrastructure involved partitioning the model into three parts: (1) Geospatial analyses: analyzes housing developments and the coast to create a household database. (2) Storm evaluation: a storm simulation database containing a suite of simulation results with varying beach and storm conditions. (3) Coastal town Agent Based Model (ABM): an ABM framework that simulates changes to the housing development and coastal landforms due to (i) feedback within the human and natural systems, and (ii) external forcings, storms, and sea level rise. Each step was then packaged into a self-contained Docker container, enabling researchers from different disciplines to modify the existing tools and test new theories without worrying about compatibility with the underlying technologies in the other containers. Furthermore, researchers can plug their own containers into the workflow, enabling testing of new or modified apps without disrupting other researchers’ usage. |
|
3:40 PM |
Development of Watershed Delineation Tool Using Open Source Software Technologies Jae Sung Kim, Colorado State University 3:40 PM - 5:00 PM A watershed delineation tool was developed in the Environmental Resource Assessment & Management System (eRAMS) platform using open source software technologies. This tool processes Digital Elevation Model (DEM) data and DEM-derived raster data to delineate watersheds using two open source software technologies, TauDEM and GDAL. Users can upload their own DEM or create one using NHDPlusV2, which is provided by default in the eRAMS watershed delineation tool. During processing of the DEM, it was found that flat areas were excluded from the stream raster pixels calculated by the TauDEM ‘Peuker Douglas’ module; the module was therefore modified to include stream pixels on flat areas whose flow accumulation value exceeded a threshold. This threshold was set to the average of the flow accumulation values on Peuker-Douglas stream pixels. To provide users flexibility, three options, Advanced, Basic, and Watershed Extraction, were developed. The Advanced option lets users run every module required for watershed delineation one by one. The Basic option allows users to run every module automatically at once, without configuring intermediate processes, if a DEM and outlet points are provided. The Watershed Extraction option uses the NHDPlusV2 flow direction raster to extract watersheds rapidly from the outlets and an area of interest. The TauDEM modules run in the Cloud Services Innovation Platform (CSIP) and are called by the eRAMS platform. The tool can be used anywhere in the continental US. |
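A minimal sketch of the threshold logic described above (an editor's illustration; the arrays are random placeholders standing in for the TauDEM flow accumulation, Peuker-Douglas stream, and flat-area grids):

    # Editor's sketch of the threshold logic; the arrays are random placeholders standing in
    # for the TauDEM flow accumulation, Peuker-Douglas stream, and flat-area grids.
    import numpy as np

    rng = np.random.default_rng(2)
    flow_acc = rng.integers(0, 500, size=(100, 100))       # placeholder flow accumulation grid
    pd_stream = flow_acc > 350                              # placeholder Peuker-Douglas stream mask
    flat_mask = np.zeros_like(pd_stream)                    # placeholder flat-area mask
    flat_mask[40:60, 40:60] = True

    threshold = flow_acc[pd_stream].mean()                  # average accumulation on PD stream pixels
    stream = pd_stream | (flat_mask & (flow_acc > threshold))   # add qualifying flat-area pixels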
|
3:40 PM |
Modeling dynamics of ecological systems with geospatial networks and agents Taylor Anderson 3:40 PM - 5:00 PM Landscape connectivity networks are composed of sets of nodes representing georeferenced habitat patches that are linked together based on the maximum dispersal distance of a species of interest. Graph theory is used to measure the structure and connectivity of the resulting networks to inform dispersal patterns, identify key habitat patches, and assess how changes in structure disrupt dispersal patterns. Despite their potential, landscape connectivity networks are mostly static representations, formed using maximum dispersal distance only, and thus do not account for network structure as a function of variation in habitat patch attributes or complex spatio-temporal dispersal dynamics. The main objective of this research is to develop a modelling approach that integrates networks and agent-based modelling (ABM) to represent a dynamic dispersal network for an ecological system: the emerald ash borer (EAB) forest insect infestation. The approach develops a network agent-based model (N-ABM) that integrates an ABM with dynamic spatial networks to simulate spatio-temporal patterns of EAB infestation at the individual scale and to generate spatial network structures of EAB dispersal as the phenomenon infests ash trees. The model is programmed in Java and implemented in Repast Simphony, a free and open source agent-based modelling and simulation platform, using geospatial datasets representing the location of ash trees suitable for EAB infestation across eastern North America as a case study. The simulated spatial networks are characterized using graph theory measures, identifying important dispersal pathways and habitat patches that exacerbate EAB dispersal and quantifying the effectiveness of habitat patch removal in disrupting dispersal. |
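Although the N-ABM itself is written in Java for Repast Simphony, the underlying static landscape connectivity network is easy to sketch: patches become nodes, two patches are linked when they lie within the maximum dispersal distance, and graph measures flag key patches. The Python/NetworkX example below uses hypothetical patch coordinates and a hypothetical dispersal distance purely for illustration.

```python
# Minimal sketch of a static landscape connectivity network (illustrative only;
# the N-ABM extends this with patch attributes and dispersal dynamics).
import math
import networkx as nx

patches = {0: (0.0, 0.0), 1: (1.2, 0.4), 2: (3.5, 0.1), 3: (3.9, 1.0)}  # id -> (x, y) km
max_dispersal_km = 2.0  # hypothetical maximum dispersal distance

G = nx.Graph()
G.add_nodes_from(patches)
for i, (xi, yi) in patches.items():
    for j, (xj, yj) in patches.items():
        if i < j and math.hypot(xi - xj, yi - yj) <= max_dispersal_km:
            G.add_edge(i, j)

# graph-theoretic measure used to flag key patches along dispersal pathways
key_patches = nx.betweenness_centrality(G)
```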
|
Thursday, June 28th | ||
9:00 AM |
William Farmer, USGS 9:00 AM - 10:20 AM The United States Geological Survey (USGS) developed the National Hydrologic Model (NHM) to support coordinated, comprehensive, and consistent hydrologic model development and application within the conterminous United States (CONUS). The NHM application of the Precipitation-Runoff Modeling System (NHM-PRMS) was used to model 1,380 gaged watersheds across the CONUS to test the feasibility of improving streamflow simulations by linking statistically- and physically-based hydrologic models. Daily streamflow was simulated at each of the 1,380 gaged watersheds using a cross-validated implementation of pooled ordinary kriging. In this manner, the streamflow at each gage was simulated as if no at-site streamflow information were available. The objectives of this study were to 1) test the ability of the NHM-PRMS and ordinary kriging to simulate streamflow across the CONUS for select watersheds, 2) compare simulations of the NHM-PRMS calibrated using measured streamflow and the NHM-PRMS calibrated using ordinary kriging with measured streamflow, and 3) determine the feasibility of using ordinary kriging in place of measured streamflow to calibrate the NHM-PRMS to provide streamflow simulations in ungaged basins. |
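A minimal, hypothetical sketch of the leave-one-out idea (not the pooled formulation used in the study): for a single day, flow at a target gage is estimated by ordinary kriging of log-transformed flows at all other gages, as if no at-site record existed. The gage coordinates, flows, and variogram model are assumptions.

```python
# Minimal leave-one-out ordinary kriging sketch for a single day of streamflow.
import numpy as np
from pykrige.ok import OrdinaryKriging

# hypothetical gage coordinates (projected, km) and same-day streamflow (cfs)
x = np.array([10.0, 42.0, 55.0, 71.0, 90.0])
y = np.array([20.0, 35.0, 12.0, 60.0, 44.0])
q = np.array([120.0, 85.0, 40.0, 210.0, 95.0])

target = 2                                # gage treated as if it were ungaged
others = np.arange(len(q)) != target

ok = OrdinaryKriging(x[others], y[others], np.log10(q[others]),
                     variogram_model="exponential")
log_q_hat, _ = ok.execute("points", [x[target]], [y[target]])
q_hat = 10 ** log_q_hat[0]                # back-transform to cfs
```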
|
9:00 AM |
Regional Assessment of Temporal Changes in Flood Frequency and Magnitude Tyler Wible, Colorado State University 9:00 AM - 10:20 AM Temporal changes in flood frequency and magnitude have been studied for their significance in hydraulic design. The Bulletin 17B Log-Pearson Type III flood frequency analysis, developed by the USGS, is one of the most common analysis frameworks and has been incorporated into automated software packages (Flynn et al., 2006). However, the inherent assumption of stationarity in most of these analyses is no longer applicable under changing climate conditions. This study summarizes a reanalysis of USGS peak flow gage records in the United States over the last 60 years to quantify temporal trends in flood frequency. The Cloud Services Innovation Platform (CSIP), a model-as-a-service cloud-computing platform, was used to perform the analysis with the Bulletin 17B flood frequency method for all stream monitoring stations across the nation. Results indicate regional temporal trends and spatial patterns in changes in flood frequency across the continental U.S. |
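The core Log-Pearson Type III quantile calculation behind Bulletin 17B can be sketched in a few lines: fit the mean, standard deviation, and skew of the log-transformed annual peaks, then take the appropriate non-exceedance quantile. The sketch below omits the regional skew weighting, confidence limits, and outlier tests that the full Bulletin 17B procedure prescribes, and the peak-flow values are hypothetical.

```python
# Minimal Log-Pearson Type III sketch: 100-year flood from annual peak flows.
import numpy as np
from scipy import stats

peaks = np.array([3200., 5100., 2800., 7600., 4400., 3900., 6100., 2500.])  # cfs

logq = np.log10(peaks)
mean, std = logq.mean(), logq.std(ddof=1)
skew = stats.skew(logq, bias=False)

# 100-year flood: the 0.99 non-exceedance quantile of the fitted LP3 distribution
q100 = 10 ** stats.pearson3(skew, loc=mean, scale=std).ppf(0.99)
```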
|
9:00 AM |
Technological Innovations in Continental Scale Routing utilizing the USGS National Hydrologic Model Ashley E. Van Beusekom, USGS 9:00 AM - 10:20 AM The choice of streamflow routing technique in hydrological modeling can alter the timing of simulated streamflow more than any other single part of the modeling process. Many large-scale models use relatively simple non-dynamical channel flow as a routing scheme, in which storage of water in a reach is linearly related to inflow and outflow rates, to model streamflow delay and (flood) attenuation. Considerable and valuable gains in model performance have been identified in large floodplains by switching from non-dynamical to dynamical routing. In dynamical routing, streamflow is delayed and attenuated while the characteristics of streamflow depth and velocity change; this is done by approximating a wave speed for the streamflow with kinematic calculations on a particle. Kinematic wave tracking is more physically realistic but also more computationally expensive than non-dynamical routing. Assessment of dynamical routing at the continental scale must account for the increased computational expense along with the limitations of data resolution for deriving parameter and calibration datasets. In a new continental-scale model utilizing the USGS National Hydrologic Model, we test the effect of dynamical routing on model performance. In addition, we explore enhancing continental-scale streamflow routing with gaining and losing streams, again investigating the trade-off between model physical realism, data availability, and computational expense. |
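As a point of reference for the non-dynamical class of schemes described above, the sketch below shows Muskingum routing, in which reach storage is a linear combination of inflow and outflow. It is a generic textbook example, not the routing implemented in the National Hydrologic Model, and the parameter values and hydrograph are hypothetical.

```python
# Minimal Muskingum (non-dynamical, linear-storage) routing sketch.
import numpy as np

def muskingum_route(inflow, K=12.0, x=0.2, dt=6.0):
    """Route an inflow hydrograph; K and dt share units (e.g. hours)."""
    denom = 2.0 * K * (1.0 - x) + dt
    c0 = (dt - 2.0 * K * x) / denom
    c1 = (dt + 2.0 * K * x) / denom
    c2 = (2.0 * K * (1.0 - x) - dt) / denom
    outflow = np.empty_like(inflow, dtype=float)
    outflow[0] = inflow[0]                           # assume initial steady state
    for t in range(1, len(inflow)):
        outflow[t] = c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[t - 1]
    return outflow

# a simple triangular flood wave to show the delay and attenuation
hydrograph = np.array([10., 10., 40., 80., 120., 90., 60., 35., 20., 12., 10.])
routed = muskingum_route(hydrograph)
```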
|
9:00 AM |
Jessica M. Driscoll, USGS 9:00 AM - 10:20 AM A comprehensive understanding of the physical processes that affect streamflow is required to effectively manage water resources to meet present and future human and environmental needs. Water resources management from local to national scales can benefit from a consistent, process-based watershed modeling capability. The National Hydrologic Model (NHM), which was developed by the U.S. Geological Survey to support coordinated, comprehensive, and consistent hydrologic modeling at multiple scales for the conterminous United States, provides this essential capability. The NHM fills knowledge gaps in ungaged areas to disseminate nationally consistent, locally informed, stakeholder-relevant results. The NHM provides scientists, water resource managers, and the public with the knowledge to advance basic scientific inquiry, enable more informed and effective decision-making, and serve as an educational resource for learning about all components of the water balance. In the future, as improved understanding of hydrologic processes allows for better algorithms and datasets, the NHM will continue to evolve to better support the nation's water-resources research, decision making, and education needs. |
|
10:40 AM |
Jacob LaFontaine, USGS 10:40 AM - 12:00 PM The U.S. Geological Survey (USGS) National Water Census is assessing the current state of water resources for the United States. The literature has consistently shown that various hydrologic models, including distributed-parameter process-based, water balance, and statistical models, have difficulty simulating hydrologic response in the semi-arid central U.S., a north-south region stretching from North Dakota to Texas. Inaccurate model simulations may result from inadequate or scarce data (e.g., climate and streamflow data), complex geology, undocumented anthropogenic impacts (water withdrawals or additions), limitations in model conceptualization, or limitations of model parameterization. One measure of hydrologic model performance is the ability to match observed streamflow volume and timing. Daily time-step streamflow simulations have been developed using spatially distributed, deterministic hydrologic models extracted from the USGS National Hydrologic Model and parameterized for the Precipitation-Runoff Modeling System. Each hydrologic model uses four gridded climate datasets that have been developed for the conterminous United States using various distribution algorithms and spatial resolutions. To quantify the effects of climate inputs on model performance, this research focuses on (1) comparing the occurrence of precipitation among the four gridded climate datasets and (2) evaluating the effects of the choice of climate forcings on daily time-step hydrologic simulation performance in the semi-arid central U.S. |
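Objective (1), comparing the occurrence of precipitation among the gridded datasets, can be illustrated with a simple wet-day-frequency calculation: the fraction of days at each grid cell with precipitation above a small threshold. The sketch below uses randomly generated stand-in arrays, an assumed 0.254 mm (0.01 in) wet-day threshold, and hypothetical array shapes; it is not the study's analysis code.

```python
# Minimal sketch: compare precipitation occurrence between two gridded datasets
# via per-cell wet-day frequency (stand-in data, hypothetical threshold).
import numpy as np

def wet_day_frequency(precip, threshold_mm=0.254):
    """precip: daily precipitation array with shape (time, lat, lon)."""
    return (precip > threshold_mm).mean(axis=0)

rng = np.random.default_rng(0)
shape = (365, 20, 30)  # one year over a hypothetical 20 x 30 grid
dataset_a = rng.gamma(0.5, 4.0, size=shape) * (rng.random(shape) < 0.30)
dataset_b = rng.gamma(0.5, 4.0, size=shape) * (rng.random(shape) < 0.25)

freq_difference = wet_day_frequency(dataset_a) - wet_day_frequency(dataset_b)
```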
|
10:40 AM |
Global Streamflow Prediction and Dynamic HAND Flood Maps Corey Krewson, Brigham Young University 10:40 AM - 12:00 PM Streamflow prediction provides direct insight into water availability and related risks. Global models are important in regions of the world that lack this critical insight from local models or historical records. Challenges for global models include their accuracy at relatively low resolution, big-data management, communication of results, and local acceptance. Using the Global Flood Awareness System (GloFAS) and the Routing Application for Parallel Computation of Discharge (RAPID), we have developed a high-density streamflow prediction system for Africa, North America, South Asia, and South America. A complete cloud-based infrastructure has been developed to automatically compute, store, and communicate results. Other developments include a generic open-access web application where results can be visualized, a REST API for accessing streamflow data programmatically, and tools that facilitate incorporation of forecasts into regional or local systems. State-of-the-art GIS techniques have been applied to provide a streamflow animation service that better visualizes how flow rates change over the forecast period and when they exceed return periods. The REST API and flow forecasts are being used to develop other applications, including dynamic flood maps in various parts of the world. These applications are valuable tools for agencies charged with disaster management and overall supervision of national water programs. |
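As an illustration of how a regional or local system might consume the forecasts, the sketch below queries a REST endpoint and flags forecast time steps that exceed a return-period threshold. The URL, parameters, and response structure are entirely hypothetical placeholders, not the actual API.

```python
# Minimal sketch: pull a streamflow forecast from a (hypothetical) REST endpoint
# and flag time steps exceeding a return-period threshold.
import requests

BASE_URL = "https://example.org/api/forecast"        # hypothetical endpoint
params = {"reach_id": 123456, "return_period": 10}   # hypothetical parameters

resp = requests.get(BASE_URL, params=params, timeout=30)
resp.raise_for_status()
forecast = resp.json()  # assumed shape: {"datetime": [...], "flow_cms": [...], "threshold_cms": ...}

exceedances = [t for t, q in zip(forecast["datetime"], forecast["flow_cms"])
               if q > forecast["threshold_cms"]]
```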