Abstract

A growing number of users access database systems for interactive and exploratory data retrieval. When searching these systems, users must use broad queries to get their desired results. Broad queries often return too many items, forcing the user to spend unnecessary time sifting through them to find the relevant results. This problem of finding a desired data item among many items is referred to as "information overload". Most users experience information overload when viewing database query results. This thesis shows that users' information overload can be reduced by clustering database query results. A hierarchical agglomerative clustering algorithm is used to cluster the query results. The reduction in users' information overload is evaluated using the information overload cost model of Chakrabarti et al. Empirical results show that users are able to find more relevant information and experience a reduction in information overload.

Degree

College and Department

Physical and Mathematical Sciences; Computer Science

Rights

http://lib.byu.edu/about/copyright/