To continuously improve teaching and learning in a language program, program evaluation must ensure that the program's assessment instruments have a beneficial influence on teaching and learning. For that reason, evidence was gathered to investigate the validity of test scores of the German Proficiency Exam (GPE) used by the German Section of the Germanic and Slavic Languages Department at Brigham Young University. The GPE consists of seven exam components: listening comprehension, reading, writing, speaking, grammar, vocabulary, and strong verbs. The GPE component scores of 179 students were analyzed in this study. To estimate the reliability of the test scores, Cronbach's alpha was calculated for the listening comprehension, reading, grammar, strong verbs, and vocabulary exams. In addition, the analysis included overall descriptive statistics, item facility and item discrimination, distractor analysis, ANOVA, and a post hoc Tukey pairwise comparison. The Cronbach's alpha results indicated relatively high score reliability for all of these exam components except the listening component. The item and distractor analysis of the strong verbs and vocabulary exams revealed that the scoring procedures need to be revised so that the scores reflect a student's true knowledge. The descriptive statistics of the exams showed limited use of the scoring range, which suggests that the scoring procedures should be defined and the scorers trained. Further, it was suggested that a general language construct and a specific construct for each language skill be defined, on the basis of which proficiency levels can be developed. Using the results of the data analysis, various suggestions were given to improve the validity of GPE scores.



validity, reliability, German, German Proficiency Exam, language testing, program evaluation, language ability, language construct, Messick, test scores, speaking, listening, writing, reading, grammar, strong verbs, vocabulary, language assessment, test analysis