Keywords
Model evaluation; validation; data splitting
Start Date
27-6-2018 9:00 AM
End Date
27-6-2018 10:20 AM
Abstract
When developing environmental models, it is generally considered good practice to conduct an independent assessment of model performance using validation data. This practice is also commonly used to compare the performance of different types of models (i.e. performance on the independent validation data is used to infer whether one model can be considered superior to another). Evaluating model performance on an independent data set requires the available data to be split into calibration and validation subsets. Consequently, the results of both model calibration and validation depend on which subset of the available data is used for each purpose. It follows that the method used to decide which data are used for calibration and which for validation can have a significant impact on the results, influencing both the nature of the model obtained and the conclusions about model adequacy and performance reached through the out-of-sample assessment. In this study, we systematically tested the impact of different methods of splitting the data into calibration and validation subsets on validation performance for data-driven hydrological models developed for 754 catchments in Australia and the USA. The results indicate that model validation error can vary by more than 100% depending on how the available data are split into calibration and validation subsets, raising the question of whether the practice of assessing the predictive validity of models increases or decreases the uncertainty associated with environmental model outputs.
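The sensitivity described in the abstract can be illustrated with a minimal sketch, which is not the study's actual experimental setup: a simple data-driven rainfall-runoff model is calibrated and validated under two hypothetical splitting schemes (a contiguous temporal split and a random split of the same proportions), and the resulting validation errors are compared. The synthetic data, variable names and model choice below are assumptions for illustration only.

# Minimal sketch (hypothetical; not the study's models, data or splitting methods):
# compare the validation error of a simple data-driven rainfall-runoff model
# under two different calibration/validation splitting schemes.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)

# Synthetic daily rainfall and runoff series, for illustration only.
n_days = 3650
rainfall = rng.gamma(shape=0.5, scale=6.0, size=n_days)
runoff = 0.4 * rainfall + 0.2 * np.roll(rainfall, 1) + rng.normal(0.0, 1.0, n_days)
X = np.column_stack([rainfall, np.roll(rainfall, 1)])

def validation_rmse(cal_idx, val_idx):
    """Calibrate on one subset and report RMSE on the held-out validation subset."""
    model = LinearRegression().fit(X[cal_idx], runoff[cal_idx])
    pred = model.predict(X[val_idx])
    return float(np.sqrt(mean_squared_error(runoff[val_idx], pred)))

# Scheme 1: contiguous temporal split (first 70% calibration, last 30% validation).
split = int(0.7 * n_days)
rmse_temporal = validation_rmse(np.arange(split), np.arange(split, n_days))

# Scheme 2: random split with the same calibration/validation proportions.
perm = rng.permutation(n_days)
rmse_random = validation_rmse(perm[:split], perm[split:])

print(f"Validation RMSE, temporal split: {rmse_temporal:.3f}")
print(f"Validation RMSE, random split:   {rmse_random:.3f}")

With stationary synthetic data such as this, the two schemes give similar errors; with real hydrological records, which are non-stationary and serially correlated, the choice of splitting scheme can change the reported validation error substantially, which is the effect the study quantifies across 754 catchments.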
Does predictive validation increase or decrease the uncertainty associated with environmental model outputs?
Stream and Session
F3: Modelling and Decision Making Under Uncertainty