Presenter/Author Information

Karina Gibert
Miquel Sànchez-Marrè
Víctor Codina

Keywords

knowledge discovery from databases, data mining, intelligent decision support system, case-based reasoning

Start Date

1-7-2010 12:00 AM

Abstract

One of the most difficult tasks in the whole KDD process is choosing the right data mining technique: commercial software tools offer more and more options, and the decision requires more and more methodological expertise. Indeed, many data mining techniques are available to an environmental scientist wishing to discover a model from her/his data. This diversity can trouble scientists, who often do not have a clear idea of which methods are available and, moreover, tend to have doubts about which method is most suitable for solving a concrete domain problem. The data mining literature lacks a common terminology. A classification of data mining methods would greatly simplify the understanding of the whole space of available methods. Furthermore, most data mining products either do not provide intelligent assistance for the data mining process or tend to do so in the form of rudimentary “wizard-like” interfaces that make hard assumptions about the user’s background knowledge. In this work, a classification of the most common data mining methods is presented in a conceptual map that makes the selection process easier. An intelligent data mining assistant is also presented. It is designed to provide model/algorithm selection support, suggesting to the user the most suitable data mining techniques for a given problem.


Choosing the Right Data Mining Technique: Classification of Methods and Intelligent Recommendation
