Data Archive

The Stanford Education Data Archive (SEDA) includes a number of publicly available data files, together with technical documentation and data codebooks, all listed below. Data files are available in Stata (v13) and .csv formats.

In publications, please cite the data as:
Sean F. Reardon, Andrew D. Ho, Benjamin R. Shear, Erin M. Fahle, Demetra Kalogrides, & Richard DiSalvo. (2018). Stanford Education Data Archive (Version 2.1). http://purl.stanford.edu/db586ns4974.

If you have questions or notice errors in the data, please contact us at sedasupport@stanford.edu.

Notes

The currently available data include district and county level average achievement (for all students and by race/ethnicity and gender), district and county level racial/ethnic and gender achievement gaps, and district level demographic/socioeconomic data. The most recent release (currently, Version 2.1) should always be used for reporting and analysis. Previous versions of the data are still available to facilitate research replication. Please review the technical documentation and codebooks that accompany the data sets. These documents review the data construction process and describe the contents of each file.

Version 1.0
Data description Download Documentation
This file contains district level means in grade equivalent units. There are multiple observations per district; one for each year, grade and subject. Stata Excel CSV Codebook
This file contains district level means in grade equivalent units. There are multiple observations per district, one for each subject; values are averaged across years and grades. Stata Excel
This file contains district level means in grade equivalent units. There is one observation per district; values are averaged across years, grades and subjects. Stata Excel
This file contains district level means in constant population standard deviation units. There are multiple observations per district; one for each year, grade and subject. Stata Excel CSV
This file contains district level means in constant population standard deviation units. There are multiple observations per district; one for each subject; values are averaged across years and grades. Stata Excel
This file contains district level means in constant population standard deviation units. There is one observation per district; values are averaged across years, grades and subjects. Stata Excel
This file contains district level means in NAEP-referenced units. Estimates are comparable between states. There are multiple observations per district; one for each year, grade and subject. Stata Excel CSV
This file contains district level means in state-referenced units. Estimates are comparable within states. There are multiple observations per district; one for each year, grade and subject. Stata Excel CSV
This file contains district level white-black and white-Hispanic achievement gaps. There are multiple observations per district; one for each year, grade and subject. Stata Excel CSV
This file contains district level white-black and white-Hispanic achievement gaps. There are multiple observations per district; one for each subject; values are averaged across years and grades. Stata Excel
This file contains district level white-black and white-Hispanic achievement gaps. There is one observation per district; values are averaged across years, grades and subjects. Stata Excel
This file contains district level covariates (socioeconomic, demographic, school level data). There are multiple observations per district; one for each year and grade. Stata CSV Codebook
This file contains district level covariates (socioeconomic, demographic, school level data). There are multiple observations per district; one for each year. Stata CSV
This file contains district level covariates (socioeconomic, demographic, school level data). There is one observation per district. Stata Excel
This file contains a unique school identifier, an identifier indicating its NCES ID (the district to which it legally belongs), and the district in which it is included in our estimates. There is one observation per school. Stata CSV Documentation
Version 1.1

Technical Documentation

Assessment Outcomes: Means and Standard Errors 
Data Description Disaggregated by Download Documentation
File Title Description Metric Geographic Level Year Grade Subject      
MeanA_V1.1 This file contains district level means in grade equivalent units. There are multiple observations per district; one for each year, grade and subject. Grade Equivalent Units District x x x Stata CSV Codebook
MeanB_V1.1 This file contains district level means in grade equivalent units. There are multiple observations per district, one for each subject; values are averaged across years and grades. Grade Equivalent Units  District     x Stata CSV
MeanC_V1.1 This file contains district level means in grade equivalent units. There is one observation per district; values are averaged across years, grades and subjects. Grade Equivalent Units District       Stata CSV
MeanD_V1.1 This file contains district level means in constant population standard deviation units. There are multiple observations per district; one for each year, grade and subject. Standard Deviation Units District x x x Stata CSV
MeanE_V1.1 This file contains district level means in constant population standard deviation units. There are multiple observations per district; one for each subject; values are averaged across years and grades. Standard Deviation Units District     x Stata CSV
MeanF_V1.1 This file contains district level means in constant population standard deviation units. There is one observation per district; values are averaged across years, grades and subjects. Standard Deviation Units District       Stata CSV
MeanG_V1.1 This file contains district level means in NAEP-referenced units. Estimates are comparable between states. There are multiple observations per district; one for each year, grade and subject. NAEP  District x x x Stata CSV
MeanH_V1.1 This file contains district level means in state-referenced units. Estimates are comparable within states. There are multiple observations per district; one for each year, grade and subject. State Referenced District x x x Stata CSV
Assessment Outcomes: Achievement Gaps
Data Description  Disaggregated by Download  Documentation
File Title Description Metric Geographic Level Year Grade Subject      
GapA_V1.1 This file contains district level white-black and white-Hispanic achievement gaps. There are multiple observations per district; one for each year, grade and subject. Standard Deviation Units District x x x Stata CSV Codebook
GapB_V1.1 This file contains district level white-black and white-Hispanic achievement gaps. There are multiple observations per district; one for each subject; values are averaged across years and grades. Standard Deviation Units District     x Stata CSV
GapC_V1.1 This file contains district level white-black and white-Hispanic achievement gaps. There is one observation per district; values are averaged across years, grades and subjects. Standard Deviation Units District       Stata CSV
Covariates 
Data Description  Disaggregated by Download  Documentation
File Title Description Metric  Geographic Level Year Grade Subject      
CovA_V1.1 This file contains district level covariates (socioeconomic, demographic, school level data). There are multiple observations per district; one for each year and grade. - District x x   Stata CSV Codebook
CovB_V1.1 This file contains district level covariates (socioeconomic, demographic, school level data). There are multiple observations per district; one for each year. - District x     Stata CSV
CovC_V1.1 This file contains district level covariates (socioeconomic, demographic, school level data). There is one observation per district. - District       Stata CSV
Ancillary Files
Data Description Disaggregated by Download  Documentation
File Title Description Metric Geographic Level Year Grade Subject      
AncillaryA_V1.1 This file contains a unique school identifier, an identifier indicating its NCES ID (the district to which it legally belongs), and the district in which it is included in our estimates. There is one observation per school. - District       Stata CSV  
AncillaryB_V1.1 This file contains the shape file that corresponds to the district crosswalk.  - National       File    
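
The crosswalk described above links each school to two districts: the one it legally belongs to and the one it is counted in for the estimates. The sketch below shows one way such a file could be read and checked, using only the Python standard library. The column names (school_id, legal_district_id, estimate_district_id) and the three rows are made up for illustration; the real variable names are given in the crosswalk codebook.

```python
import csv
import io

# Toy stand-in for the crosswalk file. The column names and values are
# hypothetical; consult the crosswalk codebook for the real ones.
TOY_CROSSWALK = """school_id,legal_district_id,estimate_district_id
S001,D10,D10
S002,D10,D10
S003,D20,D10
"""


def load_crosswalk(text):
    """Return {school_id: row}, enforcing one observation per school."""
    rows = list(csv.DictReader(io.StringIO(text)))
    ids = [row["school_id"] for row in rows]
    if len(ids) != len(set(ids)):
        raise ValueError("expected exactly one observation per school")
    return {row["school_id"]: row for row in rows}


def reassigned_schools(crosswalk):
    """Schools whose estimate district differs from their legal district."""
    return sorted(
        school
        for school, row in crosswalk.items()
        if row["legal_district_id"] != row["estimate_district_id"]
    )


crosswalk = load_crosswalk(TOY_CROSSWALK)
print(reassigned_schools(crosswalk))
```

With a real download, the same functions would take the text of the CSV file in place of the toy string, after the column names are adjusted to match the codebook.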
Version 2.0

This page contains four sets of files:

  1. Technical Documentation and Codebooks
  2. Test Score Estimates: Means, Standard Deviations, and Achievement Gaps
  3. Covariate Data Files
  4. Ancillary Data Files

Errata: The first release of SEDA 2.0 had an error in the pooled gap estimates. The current files incorporate the fix. See the technical documentation for details.

Technical Documentation and Codebooks
File Name Download
SEDA_documentation_v20 PDF
SEDA_codebook_geodist_v20 Excel
SEDA_codebook_county_v20 Excel
SEDA_codebook_cov_geodist_v20 Excel
SEDA_codebook_crosswalk_v20 Excel

Test Score Estimates: Means, Standard Deviations, and Achievement Gaps
File Name; Form; Metric; Disaggregated by: Geographic District, County, Year, Grade, Subject, Group (All, Race, Race Gaps); Download
SEDA_geodist_long_CS_v20 Long CS X   X X X X X X Stata CSV
SEDA_geodist_long_GCS_v20 Long GCS X   X X X X X X Stata CSV
SEDA_geodist_long_NAEP_v20 Long NAEP X   X X X X X X Stata CSV
SEDA_geodist_long_State_v20 Long State X   X X X X X X Stata CSV
SEDA_geodist_poolsub_CS_v20 Pooled CS X       X X X X Stata CSV
SEDA_geodist_poolsub_GCS_v20 Pooled GCS X       X X X X Stata CSV
SEDA_geodist_pool_CS_v20 Pooled CS X         X X X Stata CSV
SEDA_geodist_pool_GCS_v20 Pooled GCS X         X X X Stata CSV
Download All SEDA V2.0 Geographic District-Level Files Stata CSV
SEDA_county_long_CS_v20 Long CS   X X X X X X X Stata CSV
SEDA_county_long_GCS_v20 Long GCS   X X X X X X X Stata CSV
SEDA_county_long_NAEP_v20 Long NAEP   X X X X X X X Stata CSV
SEDA_county_long_State_v20 Long State   X X X X X X X Stata CSV
SEDA_county_poolsub_CS_v20 Pooled CS   X     X X X X Stata CSV
SEDA_county_poolsub_GCS_v20 Pooled GCS   X     X X X X Stata CSV
SEDA_county_pool_CS_v20 Pooled CS   X       X X X Stata CSV
SEDA_county_pool_GCS_v20 Pooled GCS   X       X X X Stata CSV
Download All SEDA V2.0 County-Level Files Stata CSV

Metric: CS = Cohort Scale; GCS = Grade Scale; NAEP = NAEP Scale; State = State-referenced Scale
Academic Years: 2008/09 – 2014/15
Grades: 3 – 8
Subjects: Math, ELA
Race: white, black, Hispanic, and Asian
Race Gaps: white-black, white-Hispanic, white-Asian
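
The test score file names listed above appear to follow a regular pattern: SEDA, then the geographic unit, the form, the metric, and the version. The following sketch decodes a name into those parts. The pattern is inferred from the names on this page rather than taken from the technical documentation, and the function name parse_seda_name is only illustrative.

```python
import re

# Pattern inferred from the file names listed on this page,
# e.g. SEDA_geodist_long_CS_v20. It is an assumption, not an official spec.
NAME_RE = re.compile(
    r"^SEDA_(?P<unit>geodist|county)_(?P<form>long|poolsub|pool)_"
    r"(?P<metric>CS|GCS|NAEP|State)_v(?P<major>\d)(?P<minor>\d)$"
)

# Metric labels as given in the legend above.
METRICS = {
    "CS": "Cohort Scale",
    "GCS": "Grade Scale",
    "NAEP": "NAEP Scale",
    "State": "State-referenced Scale",
}


def parse_seda_name(name):
    """Split a test score file name into unit, form, metric and version."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized test score file name: {name!r}")
    return {
        "unit": match["unit"],
        "form": match["form"],
        "metric": METRICS[match["metric"]],
        "version": f"{match['major']}.{match['minor']}",
    }


print(parse_seda_name("SEDA_county_poolsub_GCS_v20"))
```

Covariate, crosswalk and codebook files use different name shapes and are deliberately rejected by this pattern.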

Covariate Data
File Name Form Disaggregated by Download
District Year Grade
SEDA_cov_geodist_long_v20 Long X X X Stata CSV
SEDA_cov_geodist_poolyr_v20 Pooled X X   Stata CSV
SEDA_cov_geodist_pool_v20 Pooled X     Stata CSV

Ancillary Data
File Name Disaggregated by Download
School District Year
SEDA_crosswalk_v20 X   X Stata CSV
SEDA_shapefiles_v20   X   Zip
Version 2.1

This page contains five sets of files:

  1. Technical Documentation and Codebooks
  2. Data Files Used in News Articles
  3. Test Score Estimates: Means, Standard Deviations, and Achievement Gaps
  4. Covariate Data Files
  5. Ancillary Data Files

Technical Documentation and Codebooks
File Name Download
SEDA_documentation_v21 PDF
SEDA_codebook_geodist_v21 Excel
SEDA_codebook_county_v21 Excel
SEDA_codebook_cov_geodist_v21 Excel
SEDA_codebook_crosswalk_v21 Excel

Data Files Used in News Articles
Article and Date Paper Data
Where Boys Outperform Girls in Math: Rich, White and Suburban Districts. New York Times. 6/13/2018
Gender Achievement Gaps in U.S. School Districts. Reardon, S.F., Fahle, E.M., Kalogrides, D., Podolsky, A., & Zárate, R.C. (2018) Excel* Data
*Notes about the data are included in the Excel spreadsheet.

Test Score Estimates: Means, Standard Deviations, and Achievement Gaps
File Name; Form; Metric; Disaggregated by: Unit, Year, Grade, Subject, Means & SDs (All, Race, Gender), Gaps (Race, Gender); Download
SEDA_geodist_long_CS_v21 Long CS Geographic District X X X X X X X X Stata CSV
SEDA_geodist_long_GCS_v21 Long GCS Geographic District X X X X X X X X Stata CSV
SEDA_geodist_long_NAEP_v21 Long NAEP Geographic District X X X X X X X X Stata CSV
SEDA_geodist_long_State_v21 Long State Geographic District X X X X X X X X Stata CSV
SEDA_geodist_poolsub_CS_v21 Pooled CS Geographic District     X X X X X X Stata CSV
SEDA_geodist_poolsub_GCS_v21 Pooled GCS Geographic District     X X X X X X Stata CSV
SEDA_geodist_pool_CS_v21 Pooled CS Geographic District       X X X X X Stata CSV
SEDA_geodist_pool_GCS_v21 Pooled GCS Geographic District       X X X X X Stata CSV
Download All SEDA V2.1 Geographic District Files Stata CSV
SEDA_county_long_CS_v21 Long CS County X X X X X X X X Stata CSV
SEDA_county_long_GCS_v21 Long GCS County X X X X X X X X Stata CSV
SEDA_county_long_NAEP_v21 Long NAEP County X X X X X X X X Stata CSV
SEDA_county_long_State_v21 Long State County X X X X X X X X Stata CSV
SEDA_county_poolsub_CS_v21 Pooled CS County     X X X X X X Stata CSV
SEDA_county_poolsub_GCS_v21 Pooled GCS County     X X X X X X Stata CSV
SEDA_county_pool_CS_v21 Pooled CS County       X X X X X Stata CSV
SEDA_county_pool_GCS_v21 Pooled GCS County       X X X X X Stata CSV
Download All SEDA V2.1 County Files Stata CSV

Metric: CS = Cohort Scale; GCS = Grade Scale; NAEP = NAEP Scale; State = State-referenced Scale
Academic Years: 2008/09 – 2014/15
Grades: 3 – 8
Subjects: Math, ELA
Race: white, black, Hispanic, and Asian
Race Gaps: white-black, white-Hispanic, white-Asian
Gender: male, female
Gender Gaps: male-female
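
The coverage listed above implies an upper bound on how many observations a single district or county can have in each file form. The sketch below works this out, assuming (from the table) that long files are disaggregated by year, grade and subject, poolsub files by subject only, and pool files by none of the three. The names DIMENSIONS and max_rows_per_unit are illustrative. Real files may contain fewer rows per unit, since estimates may not be available for every cell.

```python
from itertools import product

# Coverage taken from the legend above.
YEARS = [f"{y}/{(y + 1) % 100:02d}" for y in range(2008, 2015)]  # 2008/09 to 2014/15
GRADES = list(range(3, 9))  # grades 3 to 8
SUBJECTS = ["Math", "ELA"]

# Dimensions each form is disaggregated by, as read from the table
# (an interpretation of the X marks, not an official definition).
DIMENSIONS = {
    "long": (YEARS, GRADES, SUBJECTS),
    "poolsub": (SUBJECTS,),
    "pool": (),
}


def max_rows_per_unit(form):
    """Upper bound on observations per district or county for a file form."""
    return len(list(product(*DIMENSIONS[form])))


for form in DIMENSIONS:
    print(form, max_rows_per_unit(form))
```

For the long form this gives 7 years x 6 grades x 2 subjects = 84 possible observations per unit, against 2 for poolsub and 1 for pool.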

Covariate Data
File Name Form Disaggregated by Download
District Year Grade
SEDA_cov_geodist_long_v21 Long X X X Stata CSV
SEDA_cov_geodist_poolyr_v21 Pooled X X   Stata CSV
SEDA_cov_geodist_pool_v21 Pooled X     Stata CSV

Ancillary Data
File Name Disaggregated by Download
School District Year
SEDA_crosswalk_v21 X   X Stata CSV
SEDA_shapefiles_v20   X   Stata CSV