STAR Help
Comparing California Standards Test Scores
Any user of the California Standards Test results should compare scores only within the
same subject and grade. That is, grade 2 English-language arts should be
compared to grade 2 English-language arts, and grade 6 mathematics
to grade 6 mathematics. No direct
comparisons should be made between grades or between content areas.
Two types of comparisons are possible: 1) comparing the average
scaled score or 2) comparing the percent of students scoring at each
performance level. The reviewer may compare results across years
within a school, between schools, or between a school and its
district, county, or state. When making comparisons, the reviewer
should consider comparing the percentage of students scoring
proficient and advanced, since the state target is for all students
to score at or above proficient.
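
As a minimal sketch of the second type of comparison, the percentage of students scoring proficient and advanced can be computed from counts of students at each performance level. The counts below are hypothetical, and the function name and dictionary layout are assumptions made for illustration, not part of any STAR reporting tool.

```python
def percent_proficient_or_above(level_counts):
    """Return the percent of students scoring proficient or advanced."""
    total = sum(level_counts.values())
    if total == 0:
        raise ValueError("no students tested")
    at_or_above = level_counts.get("proficient", 0) + level_counts.get("advanced", 0)
    return 100.0 * at_or_above / total


# Hypothetical counts for one subject and grade (same subject and grade in both groups).
school = {"advanced": 12, "proficient": 28, "basic": 35,
          "below basic": 15, "far below basic": 10}
district = {"advanced": 150, "proficient": 300, "basic": 350,
            "below basic": 120, "far below basic": 80}

school_pct = percent_proficient_or_above(school)
district_pct = percent_proficient_or_above(district)
print(f"School: {school_pct:.1f}%  District: {district_pct:.1f}%")
```

Both groups are taken from the same subject and grade, in keeping with the rule that no comparisons be made between grades or content areas.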
Comparing CAT/6 Scores
Same Year, Within School Comparisons
A reviewer may
want to compare the performance of students at different grade
levels within a school. Similarities and differences in student
performance in the same subject may be seen by comparing the percent
of students scoring at or above the 50th percentile for each grade.
When making this comparison, it is important to remember that the
number of students in each group affects the confidence in the
inferences that can be made. The smaller the group, the more cautious
one should be in making comparisons. It is also important to note
that the national norm groups to which California's students' scores
are compared were unique for each grade level.
Same Year, Between School or Between School, District, and State Comparisons
When comparing the performance of students
between schools or between a school, its district, and the state,
the reviewer may compare the average scaled scores within a content
area such as reading, language or mathematics. The most defensible
comparison is the Percent of Students Scoring At or Above the 50th
NPR (national percentile rank) within each grade and content area.
Comparing 2003 CAT/6 Scores with Stanford 9 Scores
Users should make no direct comparisons between the 2003 CAT/6 scores
and the Stanford 9 scores from previous years posted on the
Internet. Different publishers developed the two test series, and
the CAT/6 tests were developed and normed more than five years after
the Stanford 9. Making direct comparisons between the scores for the
two test series is inappropriate, because they have different
formats and difficulty levels. Score tables that can be used to
compare results between the two tests will be available at
www.startest.org during fall 2003.
Comparing Stanford 9 Scores
Comparing Group Test Results
Since the Stanford 9 was unchanged from year to year and was administered in
all California public schools, users may make comparisons from
year to year as well as between and among schools and/or districts
for the tests administered between spring 1998 and spring 2002. The
most defensible comparison is the Percent of Students Scoring At or
Above the 50th NPR. This is the percentage of students in the group
purported to have demonstrated achievement at or above grade level
on this particular test. A number of comparisons are possible, each
with its own set of cautions.
Same Year, Within School Comparisons
A
reviewer may want to compare the performance of students at
different grade levels within a school. Similarities and differences
in student performance in the same subject may be seen by comparing
the percent of students scoring at or above the 50th percentile for
each grade. When making this comparison, it is important to remember
that the number of students in each group affects the confidence in
the inferences that can be made. The smaller the group, the more
cautious one should be in making comparisons. It is also important
to note that the national norm groups to which California's
students' scores are compared were unique for each grade level.
Same School, Different Years Comparisons
There are two ways to compare two or more years of data for the
same school. In a cohort comparison, one follows the same group of
students from grade to grade over time. For example, if
48% of a school's third graders scored at or above the 50th NPR in
1999, 51% of the fourth graders scored at or above the 50th NPR in
2000, and 54% of the fifth graders scored at or above the 50th NPR
in 2001, the school appears effective in improving the achievement
of this group of students. When making this comparison, it is
important to understand that even if the number of students is the
same from year to year, the group's composition may be quite
different if student mobility (transiency) is high.
In a cross-sectional comparison, results for a single grade (for
example, third grade) are compared from year to year. Since the
results for two separate groups of
students are being compared, differences that may exist between the
two groups should be considered.
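
The two approaches can be sketched as two ways of reading the same table of results. In this illustration, results are keyed by (year, grade) and hold the percent of students scoring at or above the 50th NPR. The cohort values are the ones from the example above; the extra third-grade values for 2000 and 2001, and the function names, are hypothetical.

```python
def cohort_series(results, start_year, start_grade, n_years):
    """Follow one group of students: year and grade advance together."""
    return [results[(start_year + i, start_grade + i)] for i in range(n_years)]


def cross_sectional_series(results, grade, start_year, n_years):
    """Hold the grade fixed: a different group of students each year."""
    return [results[(start_year + i, grade)] for i in range(n_years)]


# Percent of students at or above the 50th NPR, keyed by (year, grade).
results = {
    (1999, 3): 48, (2000, 4): 51, (2001, 5): 54,  # cohort example from the text
    (2000, 3): 50, (2001, 3): 47,                 # hypothetical third-grade results
}

cohort = cohort_series(results, 1999, 3, 3)
cross_section = cross_sectional_series(results, 3, 1999, 3)
print("Cohort (grades 3, 4, 5):", cohort)
print("Cross-sectional (grade 3):", cross_section)
```

Neither series accounts for student mobility or for differences between groups, so the cautions described above still apply when reading the output.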
Scaled Score Comparisons (Cohort)
While scaled scores cannot be compared between different
tests or subject areas, they are useful for comparing
performance over time on the Stanford 9. For
example, if the second grade in a school had 52% of the students
score at or above the 50th percentile and 52% of the students also
scored at or above the 50th percentile in third grade, a comparison
of the average scaled scores may be used to determine if the
students actually demonstrated growth during the year. If a group
maintains the same position relative to the norm group, the average
scaled score will increase, because the average scaled scores for
the norm groups increase from year to year.
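
As a rough illustration, the change in a cohort's average scaled score can be computed directly from the two sets of scores. The scores below are hypothetical and are not actual Stanford 9 scaled scores; the function name is likewise an assumption.

```python
from statistics import mean


def mean_scaled_score_change(earlier_scores, later_scores):
    """Return the change in average scaled score between two administrations."""
    return mean(later_scores) - mean(earlier_scores)


# Hypothetical scaled scores for one cohort in second grade and then third grade.
grade2_scores = [560, 575, 590, 605]
grade3_scores = [590, 610, 620, 640]

change = mean_scaled_score_change(grade2_scores, grade3_scores)
print(f"Change in average scaled score: {change:+.1f}")
```

A positive change by itself is expected even for a group that only holds its position relative to the norm group, so it should be read alongside the percent scoring at or above the 50th percentile.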