Stream C: Processing environmental information including data mining, machine learning, GIS, remote sensing

Using Machine Learning to Make the Most out of Free Data: A Deforestation Case Study

Helen Mayfield, University of Queensland
Carl Smith, University of Queensland
Marcus Gallagher, University of Queensland
Lauren Coad, University of Oxford, Environmental Change Institute
Marc Hockings, The University of Queensland

Keywords

Machine learning; deforestation; freely available datasets

Location

Session C1: VI Data Mining for Environmental Sciences Session

Start Date

July 12, 2016, 11:50 AM

End Date

July 12, 2016, 12:10 PM

Abstract

Deforestation remains a major environmental issue and is studied for a range of reasons, such as informing policy decisions, predicting at-risk areas and evaluating interventions. Although a range of machine learning (ML) techniques have a proven record on complicated, non-linear problems, many of these are rarely applied to deforestation analysis. We propose that this is partly due to uncertainty in the environmental services field about how these models perform compared to standard statistical methods. There is also a lack of guidance on which situations the various models are suitable for. We compared three ML techniques, artificial neural networks (ANNs), Bayesian networks (BNs) and Gaussian processes (GPs), against classical generalised linear models (GLMs) and generalised linear mixed models (GLMMs). Each technique was evaluated using several performance metrics and assessed on its suitability for meeting three core objectives of deforestation studies: predicting location, predicting quantity and identifying predisposing factors. Constraints such as implementation time and difficulty were also considered. The datasets used for model training were restricted to freely available or low-cost datasets to allow evaluation of their potential usefulness. All models were able to provide good general predictions of the location of deforestation. None of the techniques implemented using the selected datasets was suitable for directly predicting the amount of deforestation; however, the GLMMs and BNs were useful in predicting deforestation risk and assessing the relative importance of deforestation predictors. GPs performed well when few deforestation predictors were available, and the ANNs in some instances outperformed the GLMMs. The available resources were found to be a major influence when deciding which techniques are suitable for a given study.

Included in

Civil Engineering Commons, Data Storage Systems Commons, Environmental Engineering Commons, Hydraulic Engineering Commons, Other Civil and Environmental Engineering Commons
