Keywords

machine learning, modelling climatic data, time series

Start Date

1-7-2002 12:00 AM

Abstract

A model to characterize and predict continuous time series using machine learning techniques is proposed. The model comprises three steps: dynamic discretization of continuous values, construction of probabilistic finite automata, and prediction of new series with randomness. A first problem with most machine learning models is that they are developed for discrete values, whereas most phenomena in nature are continuous. To convert the continuous values into discrete ones, a dynamic discretization method has been used. From the resulting discrete series, we have built probabilistic finite automata that capture all the representative information the series contain. Using probabilistic finite automata makes it easy to consider the different relationships between the values in the series under different environmental conditions. The learning algorithm that builds these automata is polynomial in the sample size. An algorithm to predict new series has also been proposed. It incorporates the randomness found in nature: each new value is generated by using a random number to select from the cumulative probability distribution function stored in the automata. After completing the three steps of the model, the similarity between the predicted series and the real ones has been checked. For this purpose, a new adaptable test based on the classical Kolmogorov-Smirnov two-sample test has been developed; this test takes into account the continuous nature of climatic data. The cumulative distribution functions of the observed and generated series have been compared using the concept of indistinguishable values. Finally, the proposed model has been applied to a practical case: the study of hourly global solar radiation series.
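
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several assumptions: equal-frequency (quantile) bins stand in for the dynamic discretization method, which the abstract does not specify; a fixed-order Markov chain stands in for the probabilistic finite automata; and the classical two-sample Kolmogorov-Smirnov statistic stands in for the adapted test based on indistinguishable values. All function names (`quantile_edges`, `discretize`, `build_automaton`, `generate`, `ks_statistic`) are hypothetical.

```python
import bisect
import random
from collections import Counter, defaultdict


def quantile_edges(values, n_bins):
    """Interior edges of roughly equal-frequency bins (assumed discretization)."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]


def discretize(values, edges):
    """Map each continuous value to the index of the bin it falls in."""
    return [bisect.bisect_right(edges, v) for v in values]


def build_automaton(symbols, order=1):
    """For each state (the last `order` symbols), estimate the cumulative
    distribution of the next symbol from the observed transition counts."""
    counts = defaultdict(Counter)
    for i in range(order, len(symbols)):
        counts[tuple(symbols[i - order:i])][symbols[i]] += 1
    automaton = {}
    for state, counter in counts.items():
        total = sum(counter.values())
        running, cdf = 0, []
        for symbol in sorted(counter):
            running += counter[symbol]
            cdf.append((running / total, symbol))  # last entry is exactly 1.0
        automaton[state] = cdf
    return automaton


def generate(automaton, start_state, length, rng):
    """Generate a symbol series: draw u in [0, 1) and select the first symbol
    whose cumulative probability exceeds u."""
    state = tuple(start_state)
    series = list(state)
    while len(series) < length:
        cdf = automaton.get(state)
        if cdf is None:
            # State never seen in training: fall back to a random known state.
            state = rng.choice(sorted(automaton))
            cdf = automaton[state]
        u = rng.random()
        series.append(next(symbol for p, symbol in cdf if u < p))
        state = tuple(series[-len(state):])
    return series


def ks_statistic(a, b):
    """Classical two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two empirical cumulative distribution functions."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, x) / len(a) - bisect.bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic placeholder values, not real solar radiation measurements.
    observed = [rng.uniform(0.0, 1000.0) for _ in range(500)]
    edges = quantile_edges(observed, n_bins=5)
    symbols = discretize(observed, edges)
    automaton = build_automaton(symbols, order=1)
    simulated = generate(automaton, symbols[:1], len(symbols), rng)
    print(ks_statistic(symbols, simulated))
```

Passing an explicit `random.Random` instance keeps the generated series reproducible for a given seed while still drawing each value at random, which is the role randomness plays in the prediction step.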

Probabilistic Finite Automata and Randomness in Nature: a New Approach in the Modelling and Prediction of Climatic Parameters
