This paper examines fundamental concepts in psychological and educational assessment, beginning with the distinction between norm-referenced and criterion-referenced tests. It then explores the qualities that define adequate norming groups, illustrating the complexity of group selection through the Wechsler Intelligence Scale for Children (WISC) and its application to Taiwanese children with ADHD. The paper proceeds to describe the four measurement scales—nominal, ordinal, interval, and ratio—and their relevance to counseling assessments. It concludes with a practical application example using the Montreal Cognitive Assessment (MoCA), demonstrating how age and education variables influence score interpretation and the importance of locally derived norming groups.
The difference between norm- and criterion-referenced tests is that the former compare test scores to a reference group, while the latter compare test scores to a performance standard. Norm-referenced tests are quite common. For example, student reading performance in primary schools may be compared to the mean score for all children of the same age. The norm comparison group would likely consist of all students within a school district, state, or nation who took the same test at the same age. Students who scored lower or higher than the mean for the norm reference group would be ranked as low or high achievers, respectively. Imagine, however, that someone wishing to qualify for a motor vehicle license were only required to achieve a score close to the mean score for all drivers. A norm-referenced driver's test would likely be a poor choice for public safety, especially if there are many bad drivers on the road.
By comparison, public safety would be better served if all licensed drivers were required to understand 90% of road signs, parallel park, and navigate a complex and busy intersection without any problems. These requirements represent standards of performance, and therefore driver's tests are typically criterion-referenced tests. When using a criterion-referenced test, it does not matter whether the majority of the population performs more poorly or better than the reference standard, because the standard is not tied to population performance statistics. This is probably the most important difference between norm- and criterion-referenced tests: the performance of the norming group may change over time, thereby shifting the effective standard of a norm-referenced test, whereas the reference standards on a criterion-referenced test do not change, regardless of changes in the sampled population.
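The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the two cohorts, the candidate's score, and the 90-point cutoff are all hypothetical and are not drawn from any real test.

```python
from statistics import mean, pstdev


def norm_referenced_z(score, reference_scores):
    """Return the z-score of `score` relative to a reference (norming) group."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)  # population standard deviation of the group
    if sigma == 0:
        raise ValueError("reference group has no score variability")
    return (score - mu) / sigma


def criterion_referenced_pass(score, cutoff):
    """Return True if `score` meets a fixed performance standard."""
    return score >= cutoff


# Hypothetical data: the same candidate score judged against two different cohorts.
cohort_a = [60, 65, 70, 75, 80]    # assumed weaker reference group
cohort_b = [80, 85, 90, 95, 100]   # assumed stronger reference group
candidate_score = 80
CUTOFF = 90                        # assumed fixed standard, e.g. 90% of road signs

z_a = norm_referenced_z(candidate_score, cohort_a)  # above the mean of cohort_a
z_b = norm_referenced_z(candidate_score, cohort_b)  # below the mean of cohort_b
passed = criterion_referenced_pass(candidate_score, CUTOFF)

print(f"z vs cohort A: {z_a:+.2f}, z vs cohort B: {z_b:+.2f}, meets criterion: {passed}")
```

Under these assumed numbers, the same score of 80 ranks above average in one cohort and below average in the other, yet it fails the fixed cutoff in both cases, because the criterion does not move with the reference group.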
There are no universal standards that describe what a good norming group must be; rather, the selection of an adequate reference group depends on the demographic being assessed, the goal of the assessment, and how the testing results will be used. Although selection of the norming group depends largely on who is conducting the testing, every norming group should be described in enough detail to support the assessment at hand and to let other researchers judge whether the group suits their own needs. The group should also be large enough to provide sufficient statistical power for meaningful comparisons. At a minimum, norming groups are often described using the demographic variables of age, gender, ethnicity, education, and income.
When children are administered the Wechsler Intelligence Scale for Children (WISC), the scores obtained are compared to the mean test scores of children of the same age (School Psychologist Files, n.d.). The means were obtained by having thousands of children take the test, which implies that the intelligence measured by the WISC is evaluated in comparison to norming groups stratified by age. When Yang and colleagues (2013) administered the WISC, version IV, to Taiwanese school children with attention deficit hyperactivity disorder (ADHD), they compared the scores to norming groups from China. Adequate norming groups for this study would have been Chinese children stratified by age; however, the authors expressed some concern, due to cultural differences, about the validity of comparing WISC scores obtained by Taiwanese children with those obtained by children living in mainland China. This example illustrates how complex the qualities of a norming group can be and how important it is to select an appropriate norming group for a specific comparison.
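Comparison against age-stratified norms can be sketched as follows. This is a simplified illustration, not the WISC scoring procedure: published tests rely on their own norm tables, and the norm values, the function name, and the linear conversion below are assumptions. The scale mean of 100 and standard deviation of 15 follow the common convention for Wechsler-type composite scores.

```python
# Hypothetical norm table: age in years -> (mean raw score, standard deviation).
# These values are invented for illustration and do not come from any real test.
HYPOTHETICAL_NORMS = {
    8: (40.0, 8.0),
    9: (46.0, 8.0),
    10: (52.0, 8.0),
}


def standard_score(raw, age, norms=HYPOTHETICAL_NORMS, scale_mean=100.0, scale_sd=15.0):
    """Convert a raw score to a standard score using the norms for the child's age."""
    if age not in norms:
        raise KeyError(f"no norming group available for age {age}")
    group_mean, group_sd = norms[age]
    z = (raw - group_mean) / group_sd  # position relative to same-age peers
    return scale_mean + scale_sd * z


# The same raw score is interpreted differently depending on the age stratum.
for age in (8, 9, 10):
    print(age, standard_score(46.0, age))
```

With these assumed norms, a raw score of 46 is above average for an 8-year-old, exactly average for a 9-year-old, and below average for a 10-year-old. Swapping in a norm table from a different population would shift every result, which is the concern raised about comparing Taiwanese children against norms from mainland China.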
"Four scales: nominal, ordinal, interval, and ratio explained"
"MoCA norming for age and education variables"