Abstract

Data that is represented with high dimensionality presents a computational complexity challenge for many existing algorithms. Limiting dimensionality by discarding attributes is sometimes a poor solution to this problem because significant high-level concepts may be encoded in the data across many or all of the attributes. Non-linear dimensionality reduction (NLDR) techniques have been successful with many problems at minimizing dimensionality while preserving intrinsic high-level concepts that are encoded with varying combinations of attributes. Unfortunately, many challenges remain with existing NLDR techniques, including excessive computational requirements, an inability to benefit from prior knowledge, and an inability to handle certain difficult conditions that occur in data with many real-world problems. Further, certain practical factors have limited advancement in NLDR, such as a lack of clarity regarding suitable applications for NLDR, and a general inavailability of efficient implementations of complex algorithms.

This dissertation presents a collection of papers that advance the state of NLDR in each of these areas. Contributions of this dissertation include:

• An NLDR algorithm, called Manifold Sculpting, that optimizes its solution using graduated optimization. This approach enables it to obtain better results than methods that only optimize an approximate problem. Additionally, Manifold Sculpting can benefit from prior knowledge about the problem.

• An intelligent neighbor-finding technique called SAFFRON that improves the breadth of problems that existing NLDR techniques can handle.

• A neighborhood refinement technique called CycleCut that further increases the robustness of existing NLDR techniques, and that can work in conjunction with SAFFRON to solve difficult problems.

• Demonstrations of specific applications for NLDR techniques, including the estimation of state within dynamical systems, training of recurrent neural networks, and imputing missing values in data.

• An open source toolkit containing each of the techniques described in this dissertation, as well as several existing NLDR algorithms, and other useful machine learning methods.

Degree

PhD

College and Department

Physical and Mathematical Sciences; Computer Science

Rights

http://lib.byu.edu/about/copyright/

Date Submitted

2012-05-18

Document Type

Dissertation

Handle

http://hdl.lib.byu.edu/1877/etd5241

Keywords

non-linear dimensionality reduction, manifold learning, intrinsic variables, state estimation, imputation, neighbor selection, neighborhood refinement

Language

English

Share

COinS