Abstract
When evaluating student learning, educators often employ scoring rubrics, whose quality can be judged by examining their validity and reliability. This article discusses the norming process used for a capstone scoring rubric in a graduate organizational leadership program. Concepts of validity and reliability are discussed, as is the development of a scoring rubric. Various statistical measures of inter-rater reliability are presented, and the effectiveness of those measures is discussed. Our findings indicated that inter-rater reliability can be achieved with graduate scoring rubrics, though the strength of that reliability varies substantially with the statistical measure selected. Recommendations are provided for determining validity, for measuring inter-rater reliability among multiple raters and rater pairs in assessment practice, and for other considerations in rubric development.
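The abstract notes that the strength of reliability depends heavily on which statistic is chosen. As a purely illustrative sketch (not the authors' analysis, and using invented ratings), the short Python snippet below contrasts three common two-rater statistics: simple percent agreement, Cohen's kappa, and quadratically weighted kappa. The rubric scale, the scores, and the use of scikit-learn are all assumptions for demonstration.

# Illustrative sketch only: compares several common inter-rater
# reliability statistics on hypothetical rubric scores. The ratings
# below are invented and are not data from the study.
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical scores from two raters on a 4-point rubric (1-4),
# one score per capstone artifact.
rater_a = np.array([4, 3, 3, 2, 4, 1, 3, 2, 4, 3])
rater_b = np.array([4, 3, 2, 2, 4, 1, 3, 3, 4, 3])

# Percent agreement: proportion of exact matches. It ignores
# chance agreement, so it tends to overstate reliability.
percent_agreement = np.mean(rater_a == rater_b)

# Cohen's kappa corrects for chance agreement between two raters.
kappa = cohen_kappa_score(rater_a, rater_b)

# Quadratically weighted kappa credits near-misses, which is often
# more appropriate for ordinal rubric scales.
weighted_kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

print(f"Percent agreement: {percent_agreement:.2f}")
print(f"Cohen's kappa: {kappa:.2f}")
print(f"Quadratically weighted kappa: {weighted_kappa:.2f}")

Run on the same ratings, the three statistics typically yield different values, which illustrates the article's point that a rubric may appear more or less reliable depending on the measure reported.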
Document Type
Article
Source Publication
Research & Practice in Assessment
Version
Published Version
Publication Date
2023
Volume
18
Issue
2
First Page
31
Last Page
41
Rights
© Research & Practice in Assessment
Recommended Citation
Goertzen, B. J., & Klaus, K. (2023). Is it actually reliable? Examining statistical methods for inter-rater reliability of rubrics in graduate education. Research & Practice in Assessment, 18(2), 31-41. https://www.rpajournal.com/is-it-actually-reliable-examining-statistical-methods-for-inter-rater-reliability-of-a-rubric-in-graduate-education/
Included in
Curriculum and Instruction Commons, Educational Assessment, Evaluation, and Research Commons, Higher Education Commons
Comments
For questions contact ScholarsRepository@fhsu.edu