Abstract

During this "white collar recession,'' there is a flooded labor market of workers. For employers seeking to hire, there is a need to identify potential qualified candidates for each job. The current state of the art is LinkedIn Recruiting or elastic search on Resumes. The current state of the art lacks efficiency and scalability along with an intuitive ranking of candidates. We believe this can be fixed with multi-layer categorical clustering via modularity maximization. To test this, we gathered a dataset that is extensive and representative of the job market. Our data comes from PeopleDataLabs and LinkedIn and is sampled from 153 million individuals. As such, this data represents one of the most informative datasets for the task of ranking and clustering job titles and skills. Properly grouping individuals will help identify more candidates to fulfill the multitude of vacant positions. We implement a novel framework for categorical clustering, involving these attributes to deliver a reliable pool of candidates. We develop a metric for clustering based on commonality to rank clustering algorithms. The metric prefers modularity-based clustering algorithms like the Louvain algorithm. This allows us to use such algorithms to outperform other unsupervised methods for categorical clustering. Our implementation accurately clusters emergency services, health-care and other fields while managerial positions are interestingly swamped by soft or uninformative features thereby resulting in dominant ambiguous clusters.

Degree

MS

College and Department

Physical and Mathematical Sciences; Mathematics

Rights

https://lib.byu.edu/about/copyright/

Date Submitted

2023-04-10

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd12703

Keywords

Categorical Clustering, Modularity Maximization, Unsupervised Clustering, Louvain, Sparse, Categorical Data

Language

english

Share

COinS