Keywords
hydrology, physically based modeling, algorithmic information theory, model complexity.
Location
Session G1: Using Simulation Models to Improve Understanding of Environmental Systems
Start Date
16-6-2014 9:00 AM
End Date
16-6-2014 10:20 AM
Abstract
Prediction in environmental systems, such as hydrological streamflow prediction, is a challenging task. Although many of the physical processes are well described at small scales, accurate predictions of macroscopic (e.g. catchment-scale) behavior with a bottom-up mechanistic approach often remain elusive. On the other hand, conceptual or purely statistical models fitted to data often perform surprisingly well for prediction. The data processing inequality, from the field of information theory, states that processing data with statistical procedures cannot increase the information content of the data; it can only preserve or decrease it. This seems to contradict the intuition that our knowledge of physical processes should help in making informed predictions with simulation models fed by environmental data. In this paper, we propose a perspective from information theory and algorithmic information theory to resolve this apparent contradiction and to shed light on where the information in environmental predictions originates. Algorithmic information theory relates information content to description length and therefore enables an intuitive view of inference as a form of data compression, in which information in data is compactly represented by the patterns that can be discovered in it.
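As a rough illustration of the data processing inequality mentioned in the abstract, the sketch below checks numerically that for a chain X -> Y -> Z the mutual information I(X;Z) does not exceed I(X;Y). Everything in it is an assumption made for illustration and is not taken from the paper: the three-state variable X, the noisy observation Y, the processing step Z that merges two observation classes, and all probability values are hypothetical.

```python
import math

def mutual_information(joint):
    """Mutual information in bits, computed from a joint probability table."""
    p_a = [sum(row) for row in joint]          # marginal of the row variable
    p_b = [sum(col) for col in zip(*joint)]    # marginal of the column variable
    return sum(
        p * math.log2(p / (p_a[i] * p_b[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

# Hypothetical toy system (assumed values, not from the paper).
p_x = [0.5, 0.3, 0.2]                # distribution of the true state X
p_y_given_x = [                      # noisy observation Y of X
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
p_z_given_y = [                      # processing step: Z merges classes 1 and 2 of Y
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
]

# Joint distribution of (X, Y).
joint_xy = [[p_x[i] * p_y_given_x[i][j] for j in range(3)] for i in range(3)]

# Joint distribution of (X, Z): Z depends on X only through Y.
joint_xz = [
    [sum(joint_xy[i][j] * p_z_given_y[j][k] for j in range(3)) for k in range(2)]
    for i in range(3)
]

i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)
print(f"I(X;Y) = {i_xy:.4f} bits, I(X;Z) = {i_xz:.4f} bits")
```

In this toy setting the processed variable Z carries no more information about X than the raw observation Y does, which is the behavior the inequality describes. Any gain in predictive skill would therefore have to come from information supplied outside the data, for example through model structure.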
Included in
Civil Engineering Commons, Data Storage Systems Commons, Environmental Engineering Commons, Hydraulic Engineering Commons, Other Civil and Environmental Engineering Commons
The data processing inequality and environmental model prediction