Keywords
data mining, state of the art, survey, form, observatoire
Start Date
1-7-2012 12:00 AM
Abstract
This work proposes a procedure for building a systematic state of the art and applies it to the Environmental Data Mining field. The main characteristics of the Data Mining process were identified, and a form was created to check which of those characteristics occur in a real application, and how. A random sample of Science Citation Index papers on Data Mining and Environmental Applications was selected. A set of experts read the papers and filled in one form per paper. The resulting information was itself mined, using basic statistical analysis and some specific treatments for multi-response variables, to get a first picture of what is currently being done in applications of Data Mining methods to environmental fields. The results range from a general picture of which kinds of methods are commonly used to which environmental areas seem to use data mining techniques most intensively. The paper presents and discusses these results, together with a proposal for building a continuous collaborative panel on the web to enlarge the sample of papers and keep the picture up to date. This is feasible because the analysis of the recorded data has been automated as a set of macros in a statistical package, so the mined knowledge can be updated repeatedly. The proposal aims to provide an Environmental Data Mining Observatoire where one can get updated information on what is being done in the area, identify drawbacks, orient future methodological research toward answering open environmental problems and, finally, give the environmental audience a wide corpus of previous experiences to use as a reference for new applications.
A picture on Environmental Data Mining Real Applications. What is done and how?