clusters, inefficiencies, hardware drift
Clusters have proven themselves in high-performance computing, but some are beginning to exhibit resource inefficiencies caused by increasing hardware diversity. Much of the success of clusters lies in their use of commodity components built to common hardware standards. These standards have allowed a high degree of backward compatibility, which in turn produces a condition referred to as hardware 'drift', or heterogeneity. Heterogeneity causes problems when diverse compute nodes are allocated to a single parallel job, because most parallel jobs are not self-balancing. This paper presents a new method that allows the batch scheduling system to select the best resource set for a parallel job, minimizing the adverse effects of hardware drift and increasing the overall performance of the cluster. The performance improvements of this technique are evaluated in terms of parallel job efficiency, scheduling resource utilization, and overall system performance. Using the emulation capabilities of the Maui Scheduler, the paper evaluates how several variations of the resource set allocation algorithm affect actual cluster throughput and utilization, using a recorded trace workload from a production cluster.
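To make the idea concrete, the following is a minimal illustrative sketch, not the algorithm evaluated in the paper. It assumes a simplified model in which every task of a parallel job receives equal work, so the job finishes only when its slowest node does, and it assumes a hypothetical relative speed rating per node. The names `parallel_efficiency`, `pick_homogeneous_set`, and the node data are invented for illustration.

```python
from collections import defaultdict


def parallel_efficiency(speeds):
    """Efficiency of a job that is not self-balancing (assumed model).

    Each node gets the same amount of work, so the run time is set by the
    slowest node and faster nodes sit idle while they wait for it.
    """
    return min(speeds) * len(speeds) / sum(speeds)


def pick_homogeneous_set(idle_nodes, count):
    """Pick `count` idle nodes that all share one speed rating.

    Prefers the fastest rating that has enough idle nodes. Returns None
    when no single rating can satisfy the request.
    """
    by_speed = defaultdict(list)
    for name, speed in idle_nodes.items():
        by_speed[speed].append(name)
    for speed in sorted(by_speed, reverse=True):
        if len(by_speed[speed]) >= count:
            return sorted(by_speed[speed])[:count]
    return None


# Hypothetical idle nodes with relative speed ratings.
idle = {"n1": 1000, "n2": 1000, "n3": 733, "n4": 733, "n5": 733}

# A naive allocation that mixes fast and slow nodes.
mixed = [idle[n] for n in ("n1", "n2", "n3")]
print("mixed allocation efficiency:", parallel_efficiency(mixed))

# A set-based allocation that keeps the job on matching hardware.
chosen = pick_homogeneous_set(idle, 3)
print("homogeneous set:", chosen)
print("set efficiency:", parallel_efficiency([idle[n] for n in chosen]))
```

Under this model the mixed allocation wastes part of the capacity of the two faster nodes, while the homogeneous set runs at full efficiency, at the cost of possibly leaving the fastest nodes idle. A real scheduler has to weigh that trade-off against queue wait time, which is the kind of question the paper's evaluation addresses.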
Original Publication Citation
Improving Cluster Utilization Through Set-Based Allocation Policies, David B. Jackson, Brian Haymore, Julio Facelli, Quinn O. Snell. Proceedings of the International Conference on Parallel Processing Workshops, Valencia, Spain, September 21.
BYU ScholarsArchive Citation
Snell, Quinn O.; Facelli, Julio C.; Haymore, Brian D.; and Jackson, David B., "Improving Cluster Utilization Through Set Based Allocation Policies" (2001). All Faculty Publications. 564.
Physical and Mathematical Sciences
© 2001 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Copyright Use Information