Abstract

Per-job isolation of temporary file storage in High Performance Computing (HPC) environments provides benefits in security, efficiency, and administration. HPC system administrators can use the mount_isolation Slurm task plugin to improve security by isolating temporary files where no isolation previously existed. The mount_isolation plugin also increases efficiency by removing obsolete temporary files immediately after each job terminates, which frees valuable disk space in the HPC environment for other jobs. These two improvements reduce the amount of work system administrators must expend to ensure temporary files are removed in a timely manner.

Previous temporary file removal solutions were removal on reboot, manual removal, or removal through a Slurm epilog script. The epilog script was the most effective of these, allowing files to be removed in a timely manner. However, HPC users can have multiple supercomputing jobs running concurrently, and temporary files generated by these concurrent or overlapping jobs are deleted by the epilog script only when all jobs run by that user on the compute node have completed. Even though the user may have only one running job, the temporary directory may still contain temporary files from many previously executed jobs, taking up valuable temporary storage on the compute node. The mount_isolation plugin isolates these temporary files on a per-job basis, allowing prompt removal of obsolete files regardless of job overlap.
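The difference between the two cleanup policies described in the abstract can be illustrated with a small toy model. This is only a sketch of the timing argument, not the plugin's implementation: the `Node` class, its method names, and the job scenario are all hypothetical, and no real Slurm, mount namespace, or bind mount calls are involved.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of one compute node's temporary storage (hypothetical)."""
    running: dict = field(default_factory=dict)    # job_id -> user
    tmp_files: dict = field(default_factory=dict)  # job_id -> user owning leftover temp files

    def start(self, job_id, user):
        # Every job is assumed to leave temporary files behind.
        self.running[job_id] = user
        self.tmp_files[job_id] = user

    def finish_with_epilog(self, job_id):
        # Epilog-style policy: clean a user's temp files only once that
        # user has no other job still running on this node.
        user = self.running.pop(job_id)
        if user not in self.running.values():
            self.tmp_files = {j: u for j, u in self.tmp_files.items() if u != user}

    def finish_with_isolation(self, job_id):
        # Per-job policy: each job's temp files are removed when it ends.
        self.running.pop(job_id)
        self.tmp_files.pop(job_id, None)


def leftover(policy):
    """Run three overlapping jobs for one user and return the IDs of jobs
    whose temp files remain while only job 3 is still running."""
    node = Node()
    finish = node.finish_with_epilog if policy == "epilog" else node.finish_with_isolation
    node.start(1, "alice")
    node.start(2, "alice")
    finish(1)               # job 2 still running
    node.start(3, "alice")
    finish(2)               # job 3 still running
    return sorted(node.tmp_files)


if __name__ == "__main__":
    print("epilog:", leftover("epilog"))
    print("isolation:", leftover("isolation"))
```

In this scenario the epilog-style policy never finds a moment when the user has no running job, so files from the finished jobs accumulate, while the per-job policy leaves only the files of the job that is still running.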

Degree

MS

College and Department

Ira A. Fulton College of Engineering and Technology; Technology

Rights

http://lib.byu.edu/about/copyright/

Date Submitted

2018-06-01

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd10066

Keywords

bind mounts, mount namespaces, Slurm, supercomputing, HPC, temporary storage

Language

English
