A Multi-Faceted Approach to Scrutinizing the Reliability of a Measure of STEM Teacher Strategic Knowledge
Robert M. Talbot III

TL;DR
This study thoroughly examines the reliability of a STEM teacher strategic knowledge measure using multiple methods, highlighting areas for improving measurement precision based on instrument design and context.
Contribution
It introduces a comprehensive, multi-faceted approach to assessing score reliability, combining rater agreement, classical test theory, and Generalizability Theory.
Findings
Identifies key sources of measurement error
Highlights where reliability can be improved
Provides detailed insights into measurement precision
Abstract
Score reliability is necessary for establishing a validity argument for an instrument, and is therefore highly important to investigate. Depending on the proposed instrument use and score interpretations, differing degrees of measurement precision, or reliability, are required. Researchers sometimes fail to take a critical stance when investigating this important measurement property, and default to accepted values of commonly known measures. This study takes a multi-faceted approach to scrutinizing the reliability of scores from a measure of STEM teacher strategic knowledge, using rater agreement, classical test theory conceptions of reliability, and Generalizability Theory. This detailed examination provides insight into where the greatest gains in score reliability can be realized, given the design of the instrument and the context of measurement.
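To make the three perspectives named in the abstract concrete, here is a minimal Python sketch. The abstract does not say which statistics the study computed, so the choices below are assumptions: Cohen's kappa stands in for rater agreement, coefficient alpha for classical test theory, and a relative G coefficient from a one-facet crossed design (persons by conditions) for Generalizability Theory. All function names and data are hypothetical.

```python
from statistics import mean, variance


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical codes."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    p_expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)


def cronbach_alpha(scores):
    """Coefficient alpha; rows are persons, columns are items (or raters)."""
    k = len(scores[0])
    item_vars = [variance(col) for col in zip(*scores)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def g_coefficient(scores, n_conditions):
    """Relative G coefficient for a crossed persons x conditions design.

    Variance components come from a two-way ANOVA without replication.
    n_conditions is the number of conditions (items or raters) assumed
    in the decision study, which may differ from the number observed.
    """
    n_p, n_c = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    person_means = [mean(row) for row in scores]
    cond_means = [mean(col) for col in zip(*scores)]
    ss_p = n_c * sum((m - grand) ** 2 for m in person_means)
    ss_c = n_p * sum((m - grand) ** 2 for m in cond_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_p = ss_p / (n_p - 1)
    ms_res = (ss_total - ss_p - ss_c) / ((n_p - 1) * (n_c - 1))
    var_p = (ms_p - ms_res) / n_c  # person (true-score) variance component
    return var_p / (var_p + ms_res / n_conditions)


# Hypothetical data: 5 persons scored on 3 items (or by 3 raters).
scores = [[2, 3, 3], [1, 1, 2], [4, 4, 3], [0, 1, 1], [3, 2, 4]]
print(cronbach_alpha(scores))
print(g_coefficient(scores, 3))  # matches alpha when n_conditions equals the observed count
print(g_coefficient(scores, 6))  # projected reliability with twice as many conditions
```

In this simple design the relative G coefficient reduces to coefficient alpha when the decision study uses the observed number of conditions. Varying `n_conditions` shows the kind of question a Generalizability Theory analysis can answer: how much precision would be gained by adding items or raters.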
Taxonomy
Topics: Student Assessment and Feedback · Educational Technology and Assessment · Educational Assessment and Improvement
