Presenter/Author Information

G. B. Kingston
Holger R. Maier
M. F. Lambert

Keywords

artificial neural networks, input selection, pruning, Bayesian, environmental modelling

Start Date

1 July 2004, 12:00 AM

Abstract

Artificial neural networks (ANNs) are a useful and effective tool for modelling complex, poorly understood processes, such as those that occur in nature. However, developing an ANN that properly models the desired relationship is not a trivial task. Selecting the correct causal inputs is one of the most important tasks faced by neural network practitioners, but because knowledge of the relationships modelled by ANNs is generally limited, it is also one of the most difficult tasks in ANN development. Many of the methods available for assessing the significance of potential input variables do not consider the uncertainty or variability associated with the input relevance measures used and, consequently, this important factor is neglected during hypothesis testing. In this paper, a model-based method is presented for pruning ANN inputs based on the statistical significance of the relationship between the input variables and the response variable. The approach uses Bayesian methods to estimate the input relevance measure, so that the uncertainty associated with this parameter can be quantified and hypothesis testing can be carried out in a straightforward statistical manner. The proposed methodology is applied to a synthetically generated data set and is found to successfully identify the 3 relevant inputs that were used to generate the data from the 15 candidate input variables originally entered into the ANN.
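
The abstract does not specify the relevance measure or the exact test, so the following is only a minimal sketch of the general idea: given posterior samples of some relevance measure for each candidate input, an input is retained only if its credible interval excludes zero. The function names, the 95% level, the "interval excludes zero" rule, and the synthetic Gaussian samples standing in for the output of a Bayesian fit are all assumptions made for illustration, not the authors' procedure.

```python
import random


def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval estimated from posterior samples."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int((1.0 - tail) * (n - 1))]
    return lower, upper


def select_inputs(posterior_samples, level=0.95):
    """Keep inputs whose relevance interval excludes zero (assumed rule)."""
    kept = []
    for name, samples in posterior_samples.items():
        lower, upper = credible_interval(samples, level)
        if lower > 0.0 or upper < 0.0:
            kept.append(name)
    return kept


# Hypothetical posterior samples; a real analysis would take these
# from the Bayesian estimation of the ANN, not from random.gauss.
rng = random.Random(0)
posterior_samples = {
    "x1": [rng.gauss(2.0, 0.3) for _ in range(2000)],   # clearly non-zero
    "x2": [rng.gauss(0.0, 0.5) for _ in range(2000)],   # centred on zero
    "x3": [rng.gauss(-1.5, 0.4) for _ in range(2000)],  # clearly non-zero
}
kept = select_inputs(posterior_samples)
print(kept)
```

Working from the full set of posterior samples, rather than a single point estimate, is what allows the uncertainty in the relevance measure to enter the pruning decision.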


A Statistical Input Pruning Method for Artificial Neural Networks Used in Environmental Modelling
