Keywords

workflows, provenance, semantic workflows, intelligent workflow systems

Location

Session A1: Leveraging Cyberinfrastructure to Advance Scientific Productivity and Reproducibility in the Water Sciences

Start Date

16-6-2014 3:40 PM

End Date

16-6-2014 5:20 PM

Abstract

Workflows are increasingly used in science to manage complex computations and data processing at large scale. Intelligent workflow systems provide assistance in setting up parameters and data, validating workflows created by users, and automating the generation of workflows from high-level user guidance. These systems use semantic workflows that extend workflow representations with semantic constraints that express characteristics of the data and analytic models. Reasoning algorithms propagate these semantic constraints throughout the workflow structure, select executable components for underspecified steps, and suggest parameter values. Semantic workflows also enhance provenance records with abstract steps that reflect the overall data analysis method rather than just execution traces. The benefits of semantic workflows include: 1) improving the efficiency of scientists, 2) allowing inspectability and reproducibility, and 3) disseminating expertise to new researchers. Intelligent workflow systems are an instance of provenance-aware software, since they both use and generate provenance and metadata as the data is being processed. Provenance-aware software enhances scientific analysis by propagating upstream metadata and provenance to new data products. Through the use of provenance standards, such as the recent W3C PROV recommendation for provenance on the Web, provenance-aware software can significantly enhance scientific data analysis, publication, and reuse. New capabilities are enabled when provenance is brought to the forefront in the design of software systems for science.
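The propagation of upstream metadata and provenance to new data products can be illustrated with a minimal sketch. It is not the implementation of any particular workflow system: the `Step` class, the `passes_through` and `adds` fields, the `propagate` function, and the stream-gauge example data are all hypothetical names chosen for illustration. The relation names `used`, `wasGeneratedBy`, and `wasDerivedFrom` are borrowed from the W3C PROV vocabulary, but the tuples below are plain Python data, not a PROV serialization.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Step:
    """A workflow step with a simple, hypothetical metadata rule."""
    name: str
    inputs: list[str]
    outputs: list[str]
    # Metadata keys copied unchanged from inputs to outputs.
    passes_through: tuple[str, ...] = ()
    # Metadata the step itself asserts about its outputs.
    adds: dict[str, str] = field(default_factory=dict)


def propagate(steps, metadata):
    """Forward-propagate metadata through steps given in execution order.

    Returns the metadata for every data product plus a list of
    provenance statements named after W3C PROV relations.
    """
    meta = {name: dict(values) for name, values in metadata.items()}
    prov = []
    for step in steps:
        inherited = {}
        for src in step.inputs:
            prov.append(("used", step.name, src))
            for key in step.passes_through:
                if key in meta.get(src, {}):
                    inherited[key] = meta[src][key]
        for out in step.outputs:
            # New products carry inherited metadata plus what the step adds.
            meta[out] = {**inherited, **step.adds}
            prov.append(("wasGeneratedBy", out, step.name))
            for src in step.inputs:
                prov.append(("wasDerivedFrom", out, src))
    return meta, prov


# Illustrative two-step analysis of stream-gauge data (made-up values).
steps = [
    Step("remove_outliers", ["raw_gauge"], ["clean_gauge"],
         passes_through=("site", "units"), adds={"quality": "filtered"}),
    Step("daily_mean", ["clean_gauge"], ["daily_flow"],
         passes_through=("site", "units", "quality"),
         adds={"resolution": "daily"}),
]
meta, prov = propagate(steps, {"raw_gauge": {"site": "river_A", "units": "m3/s"}})
print(meta["daily_flow"])
```

In this sketch the final product `daily_flow` ends up carrying the site and units of the raw data without the user restating them, and the `prov` list records how each product was derived. A real intelligent workflow system would reason over much richer semantic constraints than this key-copying rule.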

 

Intelligent Workflow Systems and Provenance-Aware Software
