Artifact-based research lets researchers study the creation of software while avoiding many of the difficulties of direct observation and experimentation. Open source software forges are of great value to the software researcher because they expose many of the artifacts of software development. However, many challenges affect the quality of artifact-based studies, especially studies of software evolution. This thesis addresses one of these challenges: the presence of very large commits, which we refer to as "cliff walls." Cliff walls threaten studies of software evolution because they do not appear to represent incremental development. In this thesis we demonstrate the existence of cliff walls in open source software projects and discuss the threats they present. We also seek to identify key causes of these monolithic commits and begin to explore ways that researchers can mitigate the threats of cliff walls.
College and Department
Physical and Mathematical Sciences; Computer Science
BYU ScholarsArchive Citation
Pratt, Landon James, "Cliff Walls: Threats to Validity in Empirical Studies of Open Source Forges" (2013). All Theses and Dissertations. 3511.
Keywords
software engineering, open source software, cliff walls, software evolution, version control, repository mining, latent dirichlet allocation, SourceForge, Apache Foundation, artifacts