Several recent studies have concluded that residential segregation by income in the U.S. has increased in the decades since 1970, including a significant increase after 2000. Income segregation measures, however, are biased upwards when based on sample data. This is a potential concern because the sampling rate of the American Community Survey (ACS)—from which post-2000 income segregation estimates are constructed—was lower than that of the earlier decennial Censuses. This raises the possibility that the apparent increase in income segregation post-2000 simply reflects increased upward bias in the estimates from the ACS, and the estimated increase may therefore be inaccurate.
In this paper, we first derive formulas describing the approximate sampling bias in two measures of segregation. Next, using Monte Carlo simulations, we show that the bias-corrected estimators eliminate virtually all of the bias in segregation estimates in most cases of practical interest, although the correction fails to eliminate bias in some cases when the population is unevenly distributed among geographic units and the average within-unit samples are very small. We then use the bias-corrected estimators to produce unbiased estimates of the trends in income segregation over the last four decades in large U.S. metropolitan areas. Using these corrected estimates, we replicate the central analyses in four prior papers on income segregation. We find that the primary conclusions from these papers remain unchanged, although the true increase in income segregation among families after 2000 was only half as large as that reported in earlier work. Despite this revision, our replications confirm that income segregation has increased sharply among families with children in recent decades, and that income inequality is a strong and consistent predictor of income segregation.
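The upward sampling bias described above can be illustrated with a toy Monte Carlo sketch in Python. This is not the paper's simulation code, which is the Stata .do file listed below, and it does not use the rank-order measures or the bias-correction formulas from the paper. It assumes a simplified setting: a binary income split (for example, below versus above a threshold), equally sized geographic units, and a population with no true segregation, so any nonzero index value is produced by sampling alone. The function names `variance_ratio` and `simulate_mean_index` and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def variance_ratio(counts, sizes):
    """Variance ratio segregation index for a binary group.

    counts[j] is the number of sampled group members in unit j,
    sizes[j] is the number of sampled households in unit j.
    """
    total = sum(sizes)
    p = sum(counts) / total  # pooled group share
    if p in (0.0, 1.0):
        return 0.0  # index is undefined without variation; treat as 0
    between = sum(n * (c / n - p) ** 2 for c, n in zip(counts, sizes))
    return between / (total * p * (1 - p))


def simulate_mean_index(n_per_unit, n_units=200, p=0.3, reps=100, seed=0):
    """Average estimated index over repeated samples.

    Every unit has the same true group share p, so the true index is 0
    (illustrative assumption, not the paper's design).
    """
    rng = random.Random(seed)
    sizes = [n_per_unit] * n_units
    total = 0.0
    for _ in range(reps):
        # Draw n_per_unit households per unit; each is in the group w.p. p.
        counts = [
            sum(rng.random() < p for _ in range(n_per_unit))
            for _ in range(n_units)
        ]
        total += variance_ratio(counts, sizes)
    return total / reps


if __name__ == "__main__":
    for n in (10, 100):
        print(n, simulate_mean_index(n))
```

In this simplified setting the average estimated index should come out near 1/n, where n is the within-unit sample size, even though the true value is zero. Smaller samples therefore inflate measured segregation more, which is the mechanism behind the concern about the lower ACS sampling rate.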
Income Segregation Data:
These data include all estimates of income segregation, both uncorrected and bias-corrected, used in the paper. Please see the codebook for details.
Replication Code and Data:
The data used to set the parameters for the simulation analyses are included here. These data may be used with the Stata .do file: "Sampling bias in measures of income segregation simulation v11" (linked below). The data are from the ACS 2005-2009 5-year samples of family income. Running the simulation code with these data will replicate the primary simulation results of the paper. A codebook is also provided.
The Stata .do file "Sampling bias in measures of income segregation simulation v11" runs the Monte Carlo simulation described in the paper. Running it with the provided 2005-2009 ACS data (above) will replicate the primary simulation results of the paper. The user-written commands seg.ado and rankseg.ado are also required; they are available from the SSC archive or by typing "findit seg" or "findit rankseg" in Stata.