Keywords
E3SM Land model, Exascale Computing, OpenACC, SUMMIT, Functional Unit Testing
Start Date
15-9-2020 3:40 PM
End Date
15-9-2020 4:00 PM
Abstract
The Energy Exascale Earth System Model (E3SM) is a computationally advanced coupled climate-energy model that investigates the challenges posed by interactions between weather-climate scale variability and the energy and related sectors. E3SM contains a community land model for understanding how natural and human changes in terrestrial land surfaces will affect the climate. The E3SM Land Model (ELM) consists of submodels for land biogeophysics, the hydrologic cycle, biogeochemistry, human activities, and ecosystem dynamics. In this paper, we present our early experience in redesigning ELM for a pre-exascale computer, SUMMIT, at Oak Ridge National Laboratory in the USA. Given the complexity of the ELM software system and the technical readiness of several cutting-edge computing technologies, we start our software engineering effort with single-site ELM simulations within a functional unit testing platform. This effort provides a good understanding of data structure refactoring, data movement, and code porting across heterogeneous hardware, such as GPU/CPU and disk/non-volatile memory. We investigate new OpenACC features to expedite data movement and code porting on a single SUMMIT node. We then explore new ways to generate synthesized forcing datasets for testing parallel ultra-scale ELM simulations over North America. Our early experiments show that the new OpenACC features (i.e., deepcopy and the subroutine directive) from PGI Fortran are robust enough to create dedicated data regions containing complex data structures. In addition, a single NVIDIA V100 GPU can comfortably handle up to 1900 site simulations. We can therefore use around 1500 SUMMIT nodes to undertake continental-scale development of driving datasets and offline land simulations at an ultra-scale (1 km x 1 km) resolution over North America.
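The node count quoted in the abstract can be sanity-checked with a back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the paper: it assumes each SUMMIT node hosts six V100 GPUs (a hardware detail not stated in the abstract) and that every GPU runs at the quoted ceiling of 1900 site simulations. The function and constant names are hypothetical.

```python
# Rough capacity estimate for the configuration described in the abstract.
# Assumption (not stated in the abstract): six V100 GPUs per SUMMIT node.

SITES_PER_GPU = 1900   # from the abstract: upper bound per V100 GPU
GPUS_PER_NODE = 6      # assumed SUMMIT node configuration
NODES = 1500           # from the abstract: approximate node count


def total_site_capacity(nodes: int, gpus_per_node: int, sites_per_gpu: int) -> int:
    """Return the number of site simulations that fit on the machine at once."""
    return nodes * gpus_per_node * sites_per_gpu


capacity = total_site_capacity(NODES, GPUS_PER_NODE, SITES_PER_GPU)
print(f"{capacity:,} concurrent site simulations")
```

Under these assumptions the estimate comes to roughly 17 million concurrent site simulations, each of which would correspond to one 1 km x 1 km grid cell if a site is mapped to a cell.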
Early experience in ultra-scale E3SM land model development on SUMMIT