Keywords
workflows, provenance, semantic workflows, intelligent workflow systems
Location
Session A1: Leveraging Cyberinfrastructure to Advance Scientific Productivity and Reproducibility in the Water Sciences
Start Date
16-6-2014 3:40 PM
End Date
16-6-2014 5:20 PM
Abstract
Workflows are increasingly used in science to manage complex computations and data processing at large scale. Intelligent workflow systems provide assistance in setting up parameters and data, validating workflows created by users, and automating the generation of workflows from high-level user guidance. These systems use semantic workflows that extend workflow representations with semantic constraints that express characteristics of the data and analytic models. Reasoning algorithms propagate these semantic constraints throughout the workflow structure, select executable components for underspecified steps, and suggest parameter values. Semantic workflows also enhance provenance records with abstract steps that reflect the overall data analysis method rather than just execution traces. The benefits of semantic workflows include: 1) improving the efficiency of scientists, 2) allowing inspectability and reproducibility, and 3) disseminating expertise to new researchers. Intelligent workflow systems are an instance of provenance-aware software, since they both use and generate provenance and metadata as the data is being processed. Provenance-aware software enhances scientific analysis by propagating upstream metadata and provenance to new data products. Through the use of provenance standards, such as the recent W3C PROV recommendation for provenance on the Web, provenance-aware software can significantly enhance scientific data analysis, publication, and reuse. New capabilities are enabled when provenance is brought to the forefront in the design of software systems for science.
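As a rough illustration of how provenance-aware software might propagate upstream metadata and provenance to new data products, the following minimal Python sketch wraps a single workflow step. It is not the system described in the abstract: the `DataProduct` class, the `run_step` helper, the record layout, and the example names (`gauge-01`, `raw_temps`) are all hypothetical. Only the relation names `used`, `wasGeneratedBy`, and `wasDerivedFrom` are borrowed from the W3C PROV vocabulary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataProduct:
    """A dataset plus the metadata and provenance that travel with it."""
    name: str
    values: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)
    provenance: List[Dict[str, str]] = field(default_factory=list)


def run_step(step_name: str,
             func: Callable[[List[float]], List[float]],
             source: DataProduct,
             output_name: str) -> DataProduct:
    """Run one step and carry upstream metadata and provenance forward."""
    result = func(source.values)
    # Relation names follow W3C PROV; the dictionary layout is an assumption.
    new_records = [
        {"relation": "used", "activity": step_name, "entity": source.name},
        {"relation": "wasGeneratedBy", "entity": output_name, "activity": step_name},
        {"relation": "wasDerivedFrom", "entity": output_name, "source": source.name},
    ]
    return DataProduct(
        name=output_name,
        values=result,
        metadata=dict(source.metadata),               # copy upstream metadata
        provenance=source.provenance + new_records,   # extend upstream history
    )


def to_celsius(values: List[float]) -> List[float]:
    """Hypothetical analysis step: convert Fahrenheit readings to Celsius."""
    return [round((v - 32.0) * 5.0 / 9.0, 2) for v in values]


# Hypothetical input data product with descriptive metadata.
raw = DataProduct(
    name="raw_temps",
    values=[32.0, 212.0],
    metadata={"site": "gauge-01", "variable": "water temperature"},
)
celsius = run_step("convert_to_celsius", to_celsius, raw, "temps_celsius")
```

In this sketch the derived product `celsius` inherits the site and variable metadata from `raw`, and its provenance list records which step consumed which input. A real system would more likely serialize such records in a standard PROV format than keep them as plain dictionaries.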
Included in
Civil Engineering Commons, Data Storage Systems Commons, Environmental Engineering Commons, Other Civil and Environmental Engineering Commons
Intelligent Workflow Systems and Provenance-Aware Software