NLP Cluster Analysis of Common Core State Standards and NAEP Item Specifications
Gregory Camilli, Larry Suter

TL;DR
This paper applies NLP clustering techniques to analyze the semantic relationships between Common Core standards and NAEP item specifications, assessing their construct distinctiveness and potential for improved mapping.
Contribution
It introduces a clustering-based approach to evaluate the semantic separateness of standards and specifications, providing empirical support for their classification validity.
Findings
Standards and specifications form distinct semantic clusters.
Clustering supports the construct validity of classifications.
Method demonstrates potential for mapping improvements.
Abstract
Camilli (2024) proposed a methodology that uses natural language processing (NLP) to map a set of content standards to item specifications. That study provided evidence that NLP can improve the mapping process. As part of that investigation, the nominal classifications of standards and item specifications were used to examine construct equivalence. In the current paper, we determine the strength of empirical support for the semantic distinctiveness of these classifications, which are known as "domains" for Common Core standards and "strands" for National Assessment of Educational Progress (NAEP) item specifications. This is accomplished by applying k-means clustering separately to the embedding vectors of the standards and of the specifications. We then briefly illustrate an application of these findings.
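The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional "embeddings", the labels, the farthest-point initialization, and the purity score are all assumptions chosen to keep the example small and deterministic. In practice the vectors would come from a text embedding model, and the procedure would be run once for the standards and once for the specifications.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def init_centroids(vectors, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = init_centroids(vectors, k)
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_assign == assign:
            break  # assignments no longer change
        assign = new_assign
        for j in range(k):
            members = [v for v, a in zip(vectors, assign) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
    return assign

def purity(assign, labels):
    """Share of items whose cluster's majority label matches their nominal label."""
    total = 0
    for j in set(assign):
        counts = Counter(l for a, l in zip(assign, labels) if a == j)
        total += counts.most_common(1)[0][1]
    return total / len(labels)

# Hypothetical toy data: six "embedding vectors" with placeholder nominal labels.
vectors = [[0.9, 0.1], [1.0, 0.0], [0.8, 0.2],
           [0.1, 0.9], [0.0, 1.0], [0.2, 0.8]]
labels = ["Geometry"] * 3 + ["Algebra"] * 3

assign = kmeans(vectors, k=2)
print(purity(assign, labels))  # 1.0 for this well-separated toy data
```

A purity near 1 would indicate that the empirical clusters line up with the nominal classifications, while a low value would suggest the classifications are not semantically distinct in the embedding space. Other agreement measures could be substituted for purity.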
Taxonomy
Topics: Technology and Data Analysis
Methods: k-Means Clustering · Sparse Evolutionary Training
