Abstract

Metalearning aims to obtain knowledge of the relationship between a mechanism of learning and the concrete contexts in which that mechanism is applicable. As new mechanisms of learning are continually added to the pool of learning algorithms, the chances of encountering behavioral similarity among algorithms increase. Understanding the relationships among algorithms and the interactions between algorithms and tasks helps to narrow down the space of algorithms to search for a given learning task. In addition, this process helps to disclose factors contributing to the similar behavior of different algorithms. We first study general characteristics of learning tasks and their correlation with the performance of algorithms, isolating two metafeatures whose values are fairly distinguishable between easy and hard tasks. We then devise a new metafeature that measures the difficulty of a learning task independently of the performance of learning algorithms on it. Building on these preliminary results, we then investigate more formally how we might measure the behavior of algorithms at a finer-grained level than a simple dichotomy between easy and hard tasks. We prove that, among the many possible candidates, the Classifier Output Difference (COD) measure is the only one possessing the properties of a metric necessary for further use in our proposed behavior-based clustering of learning algorithms. Finally, we cluster 21 algorithms based on COD and show the value of the clustering in 1) highlighting interesting behavioral similarity among algorithms, which leads us to a thorough comparison of Naive Bayes and Radial Basis Function Network learning, and 2) designing more accurate algorithm selection models, by predicting clusters rather than individual algorithms.
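
As a rough illustration of the COD measure named in the abstract, the Python sketch below assumes that COD is the fraction of instances on which two classifiers predict different labels. The abstract itself does not state the definition, so that reading is an assumption, as are the function names and the toy prediction lists, which are hypothetical and not results from the thesis.

```python
from itertools import combinations


def classifier_output_difference(preds_a, preds_b):
    """Fraction of instances on which two prediction lists disagree.

    Assumed reading of COD; the abstract does not spell out the formula.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("prediction lists must be non-empty")
    disagreements = sum(a != b for a, b in zip(preds_a, preds_b))
    return disagreements / len(preds_a)


def pairwise_cod(predictions):
    """Map each unordered pair of algorithm names to its COD value."""
    return {
        (name_a, name_b): classifier_output_difference(
            predictions[name_a], predictions[name_b]
        )
        for name_a, name_b in combinations(sorted(predictions), 2)
    }


# Hypothetical predictions of three algorithms on the same eight instances.
predictions = {
    "naive_bayes":   [0, 1, 1, 0, 1, 0, 0, 1],
    "rbf_network":   [0, 1, 1, 0, 1, 0, 1, 1],
    "decision_tree": [1, 1, 0, 0, 1, 1, 0, 1],
}

distances = pairwise_cod(predictions)
for pair, value in distances.items():
    print(pair, value)
```

Under this assumed definition the pairwise values can serve as a distance matrix for a standard clustering routine, so that algorithms whose outputs rarely differ end up in the same cluster.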

Degree

PhD

College and Department

Physical and Mathematical Sciences; Computer Science

Rights

http://lib.byu.edu/about/copyright/