Keywords
automatic semantic mediation; naming conventions; variable names; model metadata; lingua franca
Location
Session A1: Leveraging Cyberinfrastructure to Advance Scientific Productivity and Reproducibility in the Water Sciences
Start Date
16-6-2014 9:00 AM
End Date
16-6-2014 10:20 AM
Abstract
The CSDMS (Community Surface Dynamics Modeling System) modeling framework provides mechanisms that allow models and data sets from different contributors to be automatically coupled in a plug-and-play manner to create composite models. In developing this capability, a key challenge has been automatic semantic mediation, or name matching, because each model or data set (here called a resource) uses its own set of terms for input and output variable names. These names are often domain-specific or abbreviated. For the CSDMS framework to determine whether one model’s output variable is appropriate for use as another model’s input variable, a standardized and precise description of each variable is required. If additional information (e.g., a variable’s units) is also provided in a standardized way, the framework can automatically apply conversions (e.g., unit conversion) when needed so that coupled resources can share the numerical value of a variable. The purpose of this paper is to (1) define this semantic mediation problem in terms of design criteria for a desired solution, (2) compare alternative approaches to the problem, and (3) propose a new set of naming conventions, the CSDMS Standard Names, that meet the stated design criteria. Although this work is ongoing, CSDMS Standard Names are currently used within the CSDMS framework and are proving to be an effective solution to this problem.
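The name-matching and unit-conversion idea described in the abstract can be illustrated with a minimal Python sketch. This is not the CSDMS framework's implementation: the `Variable` class, the `match` and `convert` functions, the `TO_BASE` table, the local variable names, and the standard-name string used here are all hypothetical, chosen only to show how a shared name plus standardized units could let a framework pair one resource's output with another's input.

```python
from dataclasses import dataclass

# Hypothetical conversion factors from each unit string to metres per second.
TO_BASE = {"m s-1": 1.0, "mm hr-1": 1.0 / 3.6e6}


@dataclass
class Variable:
    local_name: str     # the resource's own, often abbreviated, name
    standard_name: str  # the shared, standardized name (illustrative string)
    units: str          # units given in a standardized notation


def match(outputs, inputs):
    """Pair each input with the output that carries the same standard name."""
    by_name = {v.standard_name: v for v in outputs}
    return [(by_name[i.standard_name], i)
            for i in inputs if i.standard_name in by_name]


def convert(value, src_units, dst_units):
    """Convert a value between two units listed in TO_BASE."""
    return value * TO_BASE[src_units] / TO_BASE[dst_units]


# Two resources that use different local names and units for one quantity.
NAME = "atmosphere_water__precipitation_leq-volume_flux"  # illustrative only
met_out = [Variable("P", NAME, "mm hr-1")]
hydro_in = [Variable("precip_rate", NAME, "m s-1")]

pairs = match(met_out, hydro_in)
src, dst = pairs[0]
# The provider reports 36 mm/hr; the consumer expects metres per second.
value_for_consumer = convert(36.0, src.units, dst.units)
```

In this sketch the two resources never need to know each other's local names (`P` and `precip_rate`); the pairing relies only on the shared standard name, and the conversion relies only on the standardized unit strings.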
Included in
Civil Engineering Commons, Data Storage Systems Commons, Environmental Engineering Commons, Other Civil and Environmental Engineering Commons
The CSDMS Standard Names: Cross-Domain Naming Conventions for Describing Process Models, Data Sets and Their Associated Variables