Keywords
classification, data mining, defoliation, image segmentation
Start Date
1-7-2006 12:00 AM
Abstract
Experimental data mining and image segmentation approaches are developed to support aerial image interpretation in defoliation survey procedures. A decision tree classifier is generated with the data mining package WEKA [Witten and Frank, 2005] from a small number of training data points drawn from known classes. It is used to predict the extents of regions containing different levels of tree mortality (severe, moderate, light and non-attack) and land cover (vegetation and ground surface). This approach is applicable to low-quality imagery without traditional image pre-processing (e.g., normalization or noise reduction). To generate the decision tree, the image is split into 20 × 20 pixel tiles, and a data point is created for each tile from the peaks of the smoothed histograms of the red, green and blue colour channels, and their average. Colour channel peaks are first examined to verify that histogram peaks effectively represent tree mortality, and to select an initial small training data set manually. Next, two small training data sets are selected randomly, to model the real-world training data selection process. Decision trees are generated from these training sets and tested on the remaining data. Stratified cross-validation is then performed on the full dataset, for comparison. The classification accuracy is 75% for cross-validation and 31–49% for the smaller training data sets. Assigning lower penalties to less severe errors gives a weighted accuracy of 79% for cross-validation, 72% for the manually selected training data and 48–65% for the randomly selected training data. For comparison, the classification accuracy of the image segmentation method is 84%. Performance on small training sets still needs to be improved, although encouraging results were achievable with well-identified, heterogeneous training data.
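The tile-based feature extraction described in the abstract might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the triangular smoothing kernel, the 8-bit RGB input, and the reading of "their average" as the mean of the three channel peaks are all assumptions made here. Only the 20 × 20 tile size comes from the abstract.

```python
import numpy as np

TILE = 20  # tile edge in pixels, as stated in the abstract

# Assumed smoothing kernel (triangular, width 5); the abstract does not
# say how the histograms were smoothed.
KERNEL = np.array([1.0, 2.0, 3.0, 2.0, 1.0]) / 9.0


def smoothed_histogram_peak(values):
    """Return the intensity (0-255) at which the smoothed histogram peaks."""
    hist = np.bincount(values.ravel(), minlength=256).astype(float)
    smoothed = np.convolve(hist, KERNEL, mode="same")
    return int(np.argmax(smoothed))


def tile_features(image):
    """Build one data point per full tile of an (H, W, 3) uint8 RGB image.

    Each data point is (tile_row, tile_col, [r_peak, g_peak, b_peak, avg]),
    where avg is assumed to be the mean of the three channel peaks.
    Partial tiles at the right and bottom edges are skipped.
    """
    height, width, _ = image.shape
    features = []
    for top in range(0, height - TILE + 1, TILE):
        for left in range(0, width - TILE + 1, TILE):
            tile = image[top:top + TILE, left:left + TILE]
            peaks = [smoothed_histogram_peak(tile[..., c]) for c in range(3)]
            features.append((top // TILE, left // TILE, peaks + [sum(peaks) / 3.0]))
    return features
```

The resulting rows, paired with class labels for the training tiles, could then be exported (for example as CSV or ARFF) for a decision tree learner such as those provided by WEKA.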
Data mining and image segmentation approaches for classifying defoliation in aerial forest imagery