Keywords

automatic semantic mediation; naming conventions; variable names; model metadata; lingua franca

Location

Session A1: Leveraging Cyberinfrastructure to Advance Scientific Productivity and Reproducibility in the Water Sciences

Start Date

16-6-2014 9:00 AM

End Date

16-6-2014 10:20 AM

Abstract

The CSDMS (Community Surface Dynamics Modeling System) modeling framework provides mechanisms that allow models and data sets from different contributors to be coupled automatically, in a plug-and-play manner, to create composite models. A key challenge in developing this capability has been automatic semantic mediation, or name matching, because each model or data set (here called a resource) uses its own terms for input and output variable names. These names are often domain-specific or abbreviated. For the CSDMS framework to determine whether one model’s output variable is suitable as another model’s input variable, it requires a standardized and precise description of each variable. If additional information (e.g., a variable’s units) is also provided in a standardized way, the framework can automatically apply conversions (e.g., unit conversion) when needed so that coupled resources can share the numerical value of a variable. The purpose of this paper is to (1) define the semantic mediation problem in terms of design criteria for a desired solution, (2) compare alternative approaches to the problem, and (3) propose a new set of naming conventions, the CSDMS Standard Names, that meets the stated design criteria. Although this work is ongoing, the CSDMS Standard Names are currently used within the CSDMS framework and are proving to be an effective solution to this problem.
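
As an illustration of the name-matching idea described above, the following minimal Python sketch pairs one resource's output with another's input through a shared standardized name and derives a unit conversion factor. It is not the CSDMS framework's implementation: the `Variable` class, the `match_variables` function, the `TO_METERS` table, the resource variable names, and the standardized name `channel_water__depth` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

# Hypothetical conversion factors from a length unit to meters (illustrative only).
TO_METERS = {"m": 1.0, "cm": 0.01, "mm": 0.001, "km": 1000.0}


@dataclass(frozen=True)
class Variable:
    local_name: str     # the resource's own, often abbreviated, name
    standard_name: str  # shared standardized name (hypothetical here)
    units: str          # unit string, a key of TO_METERS


def match_variables(outputs, inputs):
    """Pair each input with an output that has the same standardized name.

    Returns a list of (provider, consumer, factor) tuples, where multiplying
    a value in the provider's units by `factor` gives the consumer's units.
    """
    by_name = {v.standard_name: v for v in outputs}
    matches = []
    for consumer in inputs:
        provider = by_name.get(consumer.standard_name)
        if provider is not None:
            factor = TO_METERS[provider.units] / TO_METERS[consumer.units]
            matches.append((provider, consumer, factor))
    return matches


# Two hypothetical resources that name the same quantity differently.
model_a_outputs = [Variable("d", "channel_water__depth", "cm")]
model_b_inputs = [Variable("flow_depth", "channel_water__depth", "m")]

matches = match_variables(model_a_outputs, model_b_inputs)
provider, consumer, factor = matches[0]
converted = 250.0 * factor  # a value of 250 in model A's units, in model B's units
```

Here the two resources never need to know each other's local names (`d` and `flow_depth`); the match is made entirely on the standardized name, and the units metadata determines the conversion.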


The CSDMS Standard Names: Cross-Domain Naming Conventions for Describing Process Models, Data Sets and Their Associated Variables
