Please note that using the research files provided at this site requires expertise in the management of large data files. These files range from 1MB to more than 90MB and can take many hours to download over a 56kb modem connection.
Working with these research files requires advanced data management skills. Many of the district and county research files are too large for spreadsheet applications such as MS Excel and Lotus. A database or statistical application such as MS Access, SAS, or SPSS is required to fully manage these research files.
For each entity (school, district, county, or state), there are on average 900 records. Each record represents a different combination of demographic subgroups, grade levels, and test types. With so many records per entity, it is critical that the desired combination of characteristics is accurately selected.
Copying individual report pages into a spreadsheet application is possible if the target computer is using the most current operating system and spreadsheet software.
The research files contain the aggregate score data for the California Standards Tests (CSTs), California Alternate Performance Assessment (CAPA), and California Achievement Tests, Sixth Edition Survey (CAT/6 Survey). The research files are available in three formats: XML, fixed width, and comma delimited. A statewide research file containing the state, county, district, and school data for All Students (no demographic subgroup data) is available in all three formats. In addition, a similar statewide research file containing the data for All Sub-groups is available in each format.
Files can also be downloaded for any single county or district. These files contain all data (all subgroups and tests) for every entity within the selected entity. For example, if a district file is selected, the data for all schools in that district will be included in the file. The research files are comma delimited and zipped to allow easier download and file import management. School-only files are not available.
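A zipped comma delimited file can be read row by row without first unpacking it to disk. The following is a minimal sketch only: the archive member name, column headers, and code values are hypothetical, and it assumes the first row of the file holds column names, which should be confirmed against the Research File Layout.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_source, member_name=None, encoding="utf-8"):
    """Yield each row of a comma delimited file inside a zip archive as a dict.

    zip_source may be a path or a binary file-like object. If member_name is
    None, the first file listed in the archive is used.
    """
    with zipfile.ZipFile(zip_source) as archive:
        name = member_name or archive.namelist()[0]
        with archive.open(name) as raw:
            # archive.open() yields bytes; wrap it so the csv module sees text.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            yield from csv.DictReader(text)


# Build a tiny in-memory archive so the sketch runs without a real download.
# Header names and code values are made up for illustration.
_demo = io.BytesIO()
with zipfile.ZipFile(_demo, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr(
        "district.txt",
        "County Code,District Code,School Code\n99,99999,0000001\n",
    )
_demo.seek(0)

rows = list(read_zipped_csv(_demo))
```

Reading rows one at a time keeps memory use low, which matters for the larger county and statewide files. Values are returned as text, so leading zeros in codes are preserved.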
The 2006 Entities File contains all school, district, and county names. This file must be merged with the research file to join these entity names with the appropriate score data. A database program such as MS Access is most appropriate for this purpose.
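For users working outside MS Access, the same merge can be approximated in a few lines of script. The sketch below assumes both files share county, district, and school code columns that together identify an entity. All column names and values here are hypothetical and should be replaced with those given in the Research File Layout.

```python
import csv
import io

# Hypothetical column names and made-up values, for illustration only.
ENTITIES_CSV = """County Code,District Code,School Code,School Name
99,99999,0000001,Example Elementary
99,99999,0000002,Sample Middle
"""

SCORES_CSV = """County Code,District Code,School Code,Test Id,Mean Scale Score
99,99999,0000001,7,345.2
99,99999,0000002,7,330.9
99,99999,0000003,7,310.0
"""

# Columns assumed to identify an entity in both files.
KEY = ("County Code", "District Code", "School Code")


def merge_names(entities_file, scores_file):
    """Attach the entity name to every score row that has a matching key."""
    names = {
        tuple(row[k] for k in KEY): row["School Name"]
        for row in csv.DictReader(entities_file)
    }
    merged = []
    for row in csv.DictReader(scores_file):
        key = tuple(row[k] for k in KEY)
        # Rows with no matching entity get a None name so they can be reviewed.
        merged.append({**row, "School Name": names.get(key)})
    return merged


merged = merge_names(io.StringIO(ENTITIES_CSV), io.StringIO(SCORES_CSV))
```

Codes are compared as text rather than numbers so that leading zeros are not lost. Unmatched score rows are kept rather than dropped, which makes gaps in the entities file easy to spot.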
Research file layouts and value lookup tables are available at Research File Layout.
The Research File Layout link provides the following information:
Users of comma delimited research files will find these layouts useful in confirming the sequence of elements as well as value lookup. Users may view and/or download any of the layouts and tables.
Also available from the Research File Layout page are two additional comma delimited lookup files:
Both of these lookup tables are useful when associating test and subgroup IDs and names with codes in the XML, comma delimited, or fixed width test data file.
A database shell is another alternative provided at this site. Once downloaded to the target computer, this application provides a powerful school, district, CDS, and ZIP code search capability as well as a formatted report containing all the data for the selected entity. This MS Access 2000 shell contains all entity data and is designed to import any of the selected state, county, or district comma delimited files. MS Access 2000 must already be installed on your computer.
Downloading Instructions for PC Users
Downloading Instructions for Mac Users
Downloading the 2006 Access Database Shell (Note: MS Access 2000 must already be installed on the target computer)
In both the Search Panel and on the Research Files description page, three search lists are identified:
Select the list corresponding to the data you wish to download. The resulting list will be alphabetical and give you the option of viewing the report or downloading the research data. Double click on your selection and use the directions above to complete the downloading of your data.
The Search button to the left of the search panel also provides a powerful search tool. Selecting the Search link returns a search form. You can enter any combination of elements into the form and return all schools that meet those criteria. These elements include:
When working with these research files, achieving accurate results requires an understanding of the structure and content of the two primary tables: the entities and the test data tables. The research files have many rows for each entity. There are records for each combination of 11 grades (includes end-of-course as a grade), 29 tests, and 51 subgroups. This means that there are hundreds to thousands of records for each entity, with an average of approximately 900 records. In order to correctly work with the data, you must use constraints to limit the data you are reporting. These constraints are discussed below.
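The counts above give an upper bound of 11 × 29 × 51 = 16,269 possible combinations per entity. The average of about 900 records is far lower, presumably because most combinations do not apply to a given entity. The sketch below computes that bound and applies constraints that reduce the records to a single combination. Field names and code values are hypothetical.

```python
GRADES, TESTS, SUBGROUPS = 11, 29, 51  # counts stated in the text

# Upper bound on distinct (grade, test, subgroup) combinations per entity.
max_combinations = GRADES * TESTS * SUBGROUPS


def select_combination(records, grade, test_id, subgroup_id):
    """Keep only records matching one grade, one test, and one subgroup."""
    return [
        r for r in records
        if r["grade"] == grade
        and r["test_id"] == test_id
        and r["subgroup_id"] == subgroup_id
    ]


# Made-up records using hypothetical field names and codes.
records = [
    {"entity": "A", "grade": 4, "test_id": 7, "subgroup_id": 1, "tested": 120},
    {"entity": "A", "grade": 4, "test_id": 7, "subgroup_id": 3, "tested": 58},
    {"entity": "A", "grade": 5, "test_id": 7, "subgroup_id": 1, "tested": 115},
    {"entity": "B", "grade": 4, "test_id": 7, "subgroup_id": 1, "tested": 90},
]

selected = select_combination(records, grade=4, test_id=7, subgroup_id=1)
```

With all three constraints applied, at most one record per entity remains, which is the condition needed before values are compared or summed across entities.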
2006 Entities Table: This table contains the state and all counties, districts, and schools in California. Because it holds school-level records as well as district, county, and state summary records, any analysis must select a single record type using the Type ID. Otherwise, students counted in a school record are counted a second or third time in the associated district, county, and state records.
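The double-counting problem can be seen in a small made-up example. The "type_id" labels below are hypothetical placeholders, not the published Type ID codes, and the counts are invented.

```python
# Made-up rows; the district row is already the sum of its two schools.
rows = [
    {"type_id": "school", "name": "Example Elementary", "tested": 100},
    {"type_id": "school", "name": "Sample Middle", "tested": 150},
    {"type_id": "district", "name": "Example Unified", "tested": 250},
]


def total_tested(rows, type_id):
    """Sum students tested across records of a single record type."""
    return sum(r["tested"] for r in rows if r["type_id"] == type_id)


# Summing every row adds the district summary on top of its own schools.
naive_total = sum(r["tested"] for r in rows)

# Restricting to one record type gives a consistent count either way.
school_total = total_tested(rows, "school")
district_total = total_tested(rows, "district")
```

Here the unfiltered sum is twice the true number of students tested, while either single record type gives the same, correct total.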
Test Data Table: This table contains the school, district, county, and state aggregate STAR counts and scores.
To accurately analyze and report from these research files, the appropriate constraints must be applied to the following elements:
Providing accurate and meaningful reports from the research files generally requires the linking of the 2006 Entities and Test Data tables. Additional efforts might include linking to the lookup tables discussed above. Working with these tables requires an understanding of relational data tables and their manipulation.
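As one possible illustration of this linking, the sketch below loads made-up rows into an in-memory SQLite database, joins the entities, test data, and a test lookup table, and applies the record type, grade, test, and subgroup constraints in one query. The table names, column names, codes, and the single combined "cds" key are assumptions made for the example and do not reflect the published file layouts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (cds TEXT PRIMARY KEY, type_id TEXT, name TEXT);
CREATE TABLE test_data (cds TEXT, grade INTEGER, test_id INTEGER,
                        subgroup_id INTEGER, tested INTEGER);
CREATE TABLE tests (test_id INTEGER PRIMARY KEY, test_name TEXT);
""")

# Made-up rows; every code and name here is hypothetical.
conn.executemany("INSERT INTO entities VALUES (?, ?, ?)", [
    ("99999990000001", "school", "Example Elementary"),
    ("99999990000002", "school", "Sample Middle"),
    ("99999990000000", "district", "Example Unified"),
])
conn.executemany("INSERT INTO test_data VALUES (?, ?, ?, ?, ?)", [
    ("99999990000001", 4, 7, 1, 100),
    ("99999990000001", 4, 7, 3, 40),
    ("99999990000002", 4, 7, 1, 150),
    ("99999990000000", 4, 7, 1, 250),
    ("99999990000001", 5, 7, 1, 95),
])
conn.execute("INSERT INTO tests VALUES (?, ?)", (7, "Hypothetical Test"))

# Join the three tables and constrain record type, grade, test, and subgroup.
query = """
SELECT e.name, t.test_name, d.tested
FROM test_data AS d
JOIN entities AS e ON e.cds = d.cds
JOIN tests AS t ON t.test_id = d.test_id
WHERE e.type_id = ? AND d.grade = ? AND d.test_id = ? AND d.subgroup_id = ?
ORDER BY e.name
"""
result = conn.execute(query, ("school", 4, 7, 1)).fetchall()
conn.close()
```

The query returns one row per school for the chosen combination and leaves out the district summary row, so the results can be totaled without double counting.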