Abstract
We present INCREMENT, a cluster refinement algorithm that uses user feedback to refine clusterings. INCREMENT can improve clusterings produced by arbitrary clustering algorithms. The initial clustering is first sub-clustered to improve query efficiency. A small set of selected instances from each sub-cluster is presented to a user for labelling. Using this feedback, INCREMENT trains a feature embedder to map the input features to a new feature space. This space is learned such that spatial distance is inversely correlated with semantic similarity, as determined from the user feedback. A final clustering is then formed in the embedded space. INCREMENT is tested on 9 datasets initially clustered with 4 distinct clustering algorithms. INCREMENT improved the accuracy of 71% of the initial clusterings with respect to a target clustering. Across all experiments, the median percent improvement is 27.3% for V-Measure and 6.08% for accuracy.
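The abstract reports results as percent improvement in V-Measure against a target clustering. As a rough illustration of how such a score could be computed, the sketch below implements the standard V-Measure (the harmonic mean of homogeneity and completeness) in plain Python. It is not taken from the thesis: the function names `entropy`, `conditional_entropy`, `v_measure`, and `percent_improvement` are hypothetical, and equal weighting of homogeneity and completeness and the relative-change definition of percent improvement are assumptions.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given), computed from label co-occurrence counts."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(target, predicted):
    """Harmonic mean of homogeneity and completeness (assumes equal weighting)."""
    h_target, h_pred = entropy(target), entropy(predicted)
    # By convention, a zero-entropy labelling gives a perfect score for that term.
    homogeneity = 1.0 if h_target == 0 else 1.0 - conditional_entropy(target, predicted) / h_target
    completeness = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(predicted, target) / h_pred
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def percent_improvement(before, after):
    """Relative change from an initial score to a refined score, in percent (assumed definition)."""
    return 100.0 * (after - before) / before
```

Because V-Measure depends only on how instances are grouped, it is unaffected by how cluster labels are named, so a refined clustering can be scored against the target clustering without first matching cluster IDs.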
Degree
MS
College and Department
Physical and Mathematical Sciences; Computer Science
Rights
http://lib.byu.edu/about/copyright/
BYU ScholarsArchive Citation
Mitchell, Logan Adam, "INCREMENT - Interactive Cluster Refinement" (2016). Theses and Dissertations. 5795.
https://scholarsarchive.byu.edu/etd/5795
Date Submitted
2016-03-01
Document Type
Thesis
Handle
http://hdl.lib.byu.edu/1877/etd8330
Keywords
Clustering, Cluster Refinement, Active Learning, User Feedback, Human in the Loop
Language
English