A Comparison of Annual Earnings Data in the Current Population Survey and in the Social Security Administration's Detailed Earnings Record

by Patrick J. Purcell
Research and Statistics Note No. 2024-01 (released March 2024)

Patrick J. Purcell is with the Office of Research, Evaluation, and Statistics, Office of Retirement and Disability Policy, Social Security Administration.

Contents of this publication are not copyrighted; any items may be reprinted, but citation is requested. The findings and conclusions presented in this note are those of the author and do not necessarily represent the views of the Social Security Administration.

Introduction

Selected Abbreviations
CPS Current Population Survey
CPS/ASEC Current Population Survey Annual Social and Economic Supplement
DER Detailed Earnings Record
OLS ordinary least squares
SIPP Survey of Income and Program Participation
SSA Social Security Administration

For earnings data, primary sources include household surveys conducted by the Census Bureau and administrative data files maintained by the Social Security Administration (SSA). Under a cooperative agreement, self-reported earnings data from Census Bureau surveys are linked with the amounts reported for those workers by their employers and recorded in the SSA files (Genadek, Hokayem, and Pendergast 2021).1 This linkage allows researchers to assess the accuracy of the earnings and benefits amounts reported by workers in household surveys, which are subject to reporting error. Research conducted with linked data sets helps the Census Bureau to improve its surveys and SSA to administer its programs more efficiently.2

There are several possible sources of error when researchers use only survey data. First, individuals have become less willing to participate in surveys in recent years, and respondents have been less likely to answer certain questions (Meyer, Mok, and Sullivan 2015; Bollinger and others 2019). Rising nonresponse rates could bias research results that use survey data alone. Second, imputation errors can occur when researchers use only survey data. For example, when respondents decline to answer questions about their earnings, the Census Bureau uses statistical methods to impute the missing information by examining the responses of other participants with similar demographic traits who answered the relevant questions. Bollinger and Hirsch (2006) found that including imputed values of earnings in regressions relating earnings to other attributes resulted in biased coefficients. However, with linked survey and administrative data, researchers can replace imputed values with values from administrative records.

Using administrative data sets alone also has limitations. Administrative data that are specific to participants in a particular program or to a geographic area may not be nationally representative of the population. Inferences about people who are not enrolled in the program or who live in other locations cannot be drawn from data that are not nationally representative. Moreover, administrative data often include information relevant only to a particular government program or policy. For example, SSA does not collect information about beneficiaries' education levels because educational attainment does not affect its program qualifications. However, because education has a strong statistical relationship to lifetime earnings and other social and economic outcomes, the Census Bureau regularly collects educational attainment information in its household surveys. Using Census Bureau survey data linked to SSA records allows researchers to include educational attainment data when studying outcomes such as lifetime earnings and Social Security benefits. Linked data files provide more and better data for studying the relationship between demographic characteristics and economic outcomes than either survey data or administrative data alone can provide.

Household surveys linked to administrative records combine the strengths of both sources, while also reducing the limitations of each source separately. The Census Bureau collects important demographic and economic information in its household surveys, such as the Current Population Survey (CPS), but surveys are also subject to recall error and misreporting. Administrative data files, including SSA's Master Earnings File and Master Beneficiary Record, contain more accurate records of earnings and benefits. Linking survey files to administrative records combines surveys' rich demographic details with administrative records' greater accuracy on certain topics.

One way that researchers use linked data is to evaluate measurement error, nonresponse, and imputation effects on survey data accuracy. Davies and Fisher (2009) reported that further research is needed on the extent to which self-reported earnings in household surveys agree with earnings recorded in administrative records from SSA. They noted that comparing earnings in household surveys with earnings in administrative records could lead to improved methods for imputing missing data and more accurate analyses of proposed revisions to the Old-Age, Survivors, and Disability Insurance and Supplemental Security Income programs. Following their suggestion, this research and statistics note compares wage and salary earnings data in the CPS and in SSA's data files.

Previous Research

Several researchers have used either CPS or Survey of Income and Program Participation (SIPP) data linked to SSA data files to examine how reported earnings in the surveys compare with earnings in administrative records. In most of these studies, differences between survey and administrative data were assumed to represent survey measurement error. Following standard practice, this note compares earnings reported in the CPS with earnings recorded in SSA's Detailed Earnings Record (DER), and differences are assumed to represent survey measurement error. Nevertheless, readers are cautioned that all sources of data, including administrative records, are subject to error.

Several studies that compared self-reported earnings in surveys with earnings in SSA's records found that survey measurement error is not random and varies with observable worker characteristics. Bound and Krueger (1991) compared a sample of heads of households' earnings from the CPS with their earnings from SSA records over 2 years. They found that CPS measurement error is nonrandom and that for men, error is negatively correlated with actual earnings. Bollinger (1998) compared earnings in the March 1978 CPS with SSA data and found that low earners significantly overreported their earnings. He concluded that overreporting among low earners is largely responsible for survey measurement error. Roemer (2002) compared CPS and SIPP earnings data with SSA earnings records and found that the distribution of annual earnings differs between the CPS and the SIPP. The CPS Annual Social and Economic Supplement (CPS/ASEC) showed an excess of high earnings and a shortage of low earnings while the SIPP showed the opposite. He attributed the shortage of low earnings in the CPS mainly to underestimates of earnings among part-year or part-time workers.

To study the nonresponse and imputation effects in the CPS on earnings measures, Bollinger and others (2019) examined CPS data linked to SSA earnings records for 2006–2011. They found that earnings were more frequently missing among lower earners and higher earners than among those with earnings closer to the median. The authors noted that if nonresponse were random, researchers could use a respondent-only sample reweighted by the inverse probability of being in the respondent sample; however, this adjustment is not sufficient when nonresponse is nonrandom. They suggested that researchers use linked survey and administrative data and replace survey earnings with administrative earnings. They also concluded that even if nonresponse were random, earnings estimates could be biased if they include imputed values for missing data.
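The reweighting idea described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each respondent's probability of responding is known and depends only on an observed group. The earnings values, the response probabilities, and the `ipw_mean` helper are invented for illustration and are not taken from Bollinger and others (2019).

```python
def ipw_mean(values, response_probs):
    """Mean of respondent values, each weighted by 1 / P(respond)."""
    weights = [1.0 / p for p in response_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical full sample: four low earners and four high earners.
full_sample = [20_000] * 4 + [60_000] * 4

# Assumed response pattern: half of the low earners respond (p = 0.5),
# all of the high earners respond (p = 1.0).
respondent_earnings = [20_000] * 2 + [60_000] * 4
respondent_probs = [0.5] * 2 + [1.0] * 4

true_mean = sum(full_sample) / len(full_sample)
naive_mean = sum(respondent_earnings) / len(respondent_earnings)
weighted_mean = ipw_mean(respondent_earnings, respondent_probs)

print(true_mean, round(naive_mean, 2), weighted_mean)
```

In this constructed case the weighted respondent mean equals the full-sample mean, while the unweighted respondent mean is too high. If response depended on the unobserved earnings amounts themselves, weights built from observed traits would not correct the estimate, which is the nonrandom case the authors caution about.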

Pedace and Bates (2000), Gottschalk and Huynh (2005), and Cristia and Schwabish (2007) examined SIPP data linked to SSA earnings records and found that reporting error is negatively correlated with earnings level (as measured in the administrative data). All three studies found that respondents with lower earnings tend to overreport earnings and those with higher earnings tend to underreport earnings. Kim and Tamborini (2012) found that SIPP respondents' misreports of earnings are nonrandom and that low earners tend to overreport earnings and high earners tend to underreport earnings, confirming earlier studies. They also found that low-earning Black workers overreport earnings more than low-earning White workers. Kim and Tamborini (2014) also used linked SIPP and SSA data and found that at higher earnings levels, higher-educated workers were less likely to underreport their earnings than were less-educated workers, but at lower earnings levels, higher-educated workers were more likely to overreport their earnings than were less-educated workers. Abowd and Stinson (2013) found that earnings data in SIPP and administrative records from SSA were similar except for imputed earnings in the SIPP, where the authors found greater survey measurement error.

Data and Methods

The data analyzed for this note consist of records from the Census Bureau's CPS/ASEC linked to SSA's DER. The CPS has been conducted since 1948 and is extensively documented on the Census Bureau's and the Bureau of Labor Statistics' websites. The CPS/ASEC is conducted annually in March, and the Census Bureau publishes detailed technical documentation for the supplement every year (Census Bureau 2022). Rothbaum and Berchick (2019) described recent changes to the CPS/ASEC income questions, including the addition of follow-up questions that allow respondents to report income in ranges rather than as specific amounts. Administrative files from SSA are available only on a restricted-use basis and are not documented as extensively as are Census Bureau public-use data files. Olsen and Hudson (2009) described SSA's earnings data, explained how the data are collected and stored, and identified some of the limitations and complexities of using the data for research purposes.

This analysis compares annual wage and salary earnings data reported by workers aged 18–69 in the CPS/ASEC with those recorded for the same workers in the DER.3 Wage and salary earnings in the DER are derived from the Form W-2 that employers submit to SSA.4

Specifically, this note compares the CPS variable WSAL_VAL (total wage and salary earnings) with a DER variable derived from Box 5 of Form W-2.5 Census Bureau (2022) defines wages and salaries as “total money earnings received for work performed as an employee during the income year. It includes wages, salary, Armed Forces pay, commissions, tips, piece-rate payments, and cash bonuses earned, before deductions are made for taxes, bonds, pensions, union dues, etc.” Box 5 of Form W-2 consists of all wages, salaries, and tips subject to the Medicare payroll tax. This includes amounts deferred into 401(k) plans, which are excluded from income tax but are subject to Medicare payroll tax.6

The DER data in this study are linked to nine CPS/ASEC files for selected years from 2005 to 2021, the most recent linked file available when this note was written. This analysis does not include self-employment income, which workers report directly to the Internal Revenue Service on Form 1040, because it is subject to different reporting requirements and different potential sources of error than wage and salary income. In this note, the term “earnings” refers to wage and salary income.

In the sections to follow, 10 charts, with accompanying discussion, present descriptive statistics on CPS/ASEC respondents and on the differences in the earnings data between the CPS and the DER by selected worker characteristics.

The first three charts show key characteristics of the March CPS/ASEC sample for 2005, 2006, 2010, 2011, 2015, 2016, and 2019 through 2022.7 These charts represent the entire CPS public-use file each year, rather than just the subset of each file linked to SSA records. Chart 1 shows the total number of households in the CPS sample and the number and percentage of sample households that were interviewed. Chart 2 shows the proportion of households that were not interviewed because the occupants could not be contacted or they declined to participate in the survey. Chart 3 shows the annual numbers and percentages of CPS respondents whose earnings were imputed by the Census Bureau because they did not answer the relevant survey questions.

Charts 4–10 show the percentage difference between CPS results and DER records on earnings for selected CPS files from March 2005 through March 2021. The percentage difference between earnings data in the CPS and in the DER—defined here as (CPS − DER) ÷ DER—was calculated for each respondent who had both CPS and DER earnings recorded.8 The percentage differences were then sorted from the largest positive difference to the largest negative difference. Each chart shows three measures of percentage difference between CPS and DER earnings data: the median difference and the 75th and 25th percentile differences (that is, the interquartile range). This analysis focuses on the percentage differences between the CPS and DER earnings data for each respondent. Therefore, the relevant median difference is the median of all respondents' CPS and DER earnings differences. This is not the same as the difference in medians between CPS and DER earnings. This distinction is important for interpreting the results of the analysis correctly.

Finding the median (or other percentile) percentage difference between CPS and DER earnings data involves sorting the differences for each person from the largest positive difference (where the CPS earnings value is higher than the DER earnings value) to the largest negative difference (where the CPS earnings value is lower than the DER earnings value). A positive difference in CPS earnings − DER earnings indicates higher earnings recorded in the CPS than in the DER. A negative difference in CPS earnings − DER earnings indicates lower earnings recorded in the CPS than in the DER. For convenience, I refer to positive differences as overreporting of earnings in the CPS and negative differences as underreporting in the CPS, regardless of whether the CPS earnings data were reported or imputed.
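The distinction between the median of person-level differences and the difference in medians can be shown with a small sketch. The earnings pairs below are hypothetical and are not drawn from the linked files; the helper names and the "inclusive" quantile method are assumptions made for illustration, since the note does not specify how percentiles were interpolated.

```python
from statistics import median, quantiles

def pct_diff(cps, der):
    """Percentage difference between CPS and DER earnings: (CPS - DER) / DER * 100."""
    return (cps - der) / der * 100.0

def label(diff):
    """Sign convention used in the note: positive = overreport, negative = underreport."""
    if diff > 0:
        return "overreport"
    if diff < 0:
        return "underreport"
    return "match"

# Hypothetical (CPS, DER) annual earnings pairs, one per respondent.
pairs = [(12_000, 8_000), (30_000, 30_000), (45_000, 42_000),
         (50_000, 55_000), (90_000, 100_000)]

diffs = [pct_diff(cps, der) for cps, der in pairs]

# Median and quartiles of the person-level percentage differences.
median_of_diffs = median(diffs)
p25, p50, p75 = quantiles(diffs, n=4, method="inclusive")

# A different statistic: the percentage difference between the two medians.
diff_of_medians = pct_diff(median(c for c, _ in pairs),
                           median(d for _, d in pairs))

print(median_of_diffs, round(diff_of_medians, 1))
```

With these invented values the median of the person-level differences is zero, while the difference between the CPS median and the DER median is about 7 percent, so the two statistics need not agree.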

Chart 4 shows the differences between CPS and DER earnings data for all individuals with earnings recorded in the CPS by earnings imputation status: both self-reported and imputed earnings, only self-reported earnings, and only imputed earnings. Charts 5–10 show the percentage difference between CPS and DER earnings data categorized by workers' DER earnings quartile, age, sex, racial or ethnic group, education level, and type of hours worked. These charts cover workers with earnings data in both the CPS and the DER, regardless of whether CPS earnings were reported or imputed. The charts are followed by a section that describes the results of an ordinary least squares (OLS) regression in which the difference between CPS and DER earnings data is regressed on the characteristics shown in the charts (imputation status, DER earnings quartile, age, sex, racial or ethnic group, education level, and type of hours worked) and on a variable indicating coverage by employer-provided health insurance.
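As a minimal sketch of that kind of regression, assuming the CPS-DER percentage difference is the outcome, the example below fits OLS with a single explanatory variable (an imputation indicator). The data and the `simple_ols` helper are hypothetical; the regression in this note uses many characteristics at once, which this one-regressor version does not attempt to reproduce.

```python
def simple_ols(x, y):
    """Closed-form OLS of y on a single regressor x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Hypothetical data: y is the percentage difference between CPS and DER earnings,
# x is 1 if CPS earnings were imputed and 0 if they were self-reported.
imputed = [0, 0, 0, 0, 1, 1, 1]
pct_difference = [2.0, -1.0, 0.0, -3.0, 40.0, -20.0, 10.0]

intercept, slope = simple_ols(imputed, pct_difference)
print(round(intercept, 2), round(slope, 2))
```

With a single 0/1 regressor, the intercept is the mean difference among self-reported cases and the slope is the gap between the imputed and self-reported means.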

Trends in Completed CPS Interviews and Imputation of Earnings

For each year of CPS data shown in Chart 1, the initial sample size was between 89,000 and 100,000 units. More than 99 percent of these units were households. The remainder were group quarters, such as college dormitories. Over time, both the number and percentage of units for which CPS interviews were successfully completed declined. The March 2005 CPS/ASEC sample consisted of 99,699 units, of which 77,482 (77.7 percent) were interviewed. The March 2006 CPS/ASEC sample consisted of 97,461 units, of which 76,048 (78.0 percent) were interviewed. For the March 2020, 2021, and 2022 surveys, an average of 90,485 units were in the sample and an average of 60,819 (67.2 percent) were interviewed. The percentage of sample units interviewed in these years may have been reduced by the COVID-19 pandemic, but the decline in the percentage of sample units interviewed began before the 2020 onset of the pandemic.

Table equivalent for Chart 1. Number of units in CPS sample and number and percentage of sample units interviewed

| Year | Units in sample | Units interviewed | Percentage of sample units interviewed |
|---|---|---|---|
| 2005 | 99,699 | 77,482 | 77.7 |
| 2006 | 97,461 | 76,048 | 78.0 |
| 2010 | 97,263 | 76,260 | 78.4 |
| 2011 | 96,958 | 75,188 | 77.5 |
| 2015 | 99,461 | 74,257 | 74.7 |
| 2016 | 94,097 | 69,484 | 73.8 |
| 2019 | 94,589 | 68,301 | 72.2 |
| 2020 | 91,500 | 60,460 | 66.1 |
| 2021 | 90,759 | 62,850 | 69.2 |
| 2022 | 89,197 | 59,148 | 66.3 |

SOURCE: CPS/ASEC.

The decline in the proportion of CPS sample units interviewed was the result of a rising proportion of households selected for the sample that chose not to participate. The Census Bureau classifies units that are not interviewed as Type A, Type B, or Type C noninterviews.9 Type A noninterviews consist of households where the residents refused to participate, no one was home after repeated attempts to contact them, the occupants were temporarily absent, a language barrier prevented the interview, the sample unit could not be located, or the occupants could not be interviewed for some other reason. Type B noninterviews are the result of an unoccupied housing unit. Type C noninterviews consist of housing units that were abandoned or demolished. In recent years, both the number of Type A noninterviews and the percentage of noninterviews that were Type A have risen. In March 2005, there were 7,485 Type A noninterviews, which constituted 33.7 percent of all noninterviews. In March 2022, there were 19,046 Type A noninterviews, which constituted 63.4 percent of all noninterviews (Chart 2).

Table equivalent for Chart 2. Number of Type A noninterviews and Type A as a percentage of all noninterviews

| Year | Number of Type A noninterviews | Type A as a percentage of all noninterviews |
|---|---|---|
| 2005 | 7,485 | 33.7 |
| 2006 | 7,070 | 33.0 |
| 2010 | 5,678 | 27.0 |
| 2011 | 6,549 | 30.1 |
| 2015 | 10,271 | 40.8 |
| 2016 | 10,590 | 43.0 |
| 2019 | 13,511 | 51.4 |
| 2020 | 18,981 | 61.2 |
| 2021 | 16,455 | 59.0 |
| 2022 | 19,046 | 63.4 |

SOURCE: CPS/ASEC.
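The interview rates and Type A shares quoted in this section can be recomputed from the counts in the table equivalents for Charts 1 and 2. The sketch below assumes that the total number of noninterviews equals units in the sample minus units interviewed; the note does not state this directly, and the helper names are illustrative.

```python
# (units in sample, units interviewed, Type A noninterviews) by CPS/ASEC year,
# copied from the table equivalents for Charts 1 and 2.
cps_units = {
    2005: (99_699, 77_482, 7_485),
    2006: (97_461, 76_048, 7_070),
    2010: (97_263, 76_260, 5_678),
    2011: (96_958, 75_188, 6_549),
    2015: (99_461, 74_257, 10_271),
    2016: (94_097, 69_484, 10_590),
    2019: (94_589, 68_301, 13_511),
    2020: (91_500, 60_460, 18_981),
    2021: (90_759, 62_850, 16_455),
    2022: (89_197, 59_148, 19_046),
}

def interview_rate(sample, interviewed):
    """Percentage of sample units interviewed."""
    return round(100 * interviewed / sample, 1)

def type_a_share(sample, interviewed, type_a):
    """Type A noninterviews as a percentage of all noninterviews.

    Assumption: all noninterviews = units in sample - units interviewed.
    """
    return round(100 * type_a / (sample - interviewed), 1)

rates = {yr: interview_rate(s, i) for yr, (s, i, _) in cps_units.items()}
shares = {yr: type_a_share(s, i, a) for yr, (s, i, a) in cps_units.items()}

# Three-year averages for the March 2020, 2021, and 2022 surveys.
recent = [cps_units[yr] for yr in (2020, 2021, 2022)]
avg_sample = sum(s for s, _, _ in recent) / len(recent)
avg_interviewed = sum(i for _, i, _ in recent) / len(recent)
avg_rate = round(100 * avg_interviewed / avg_sample, 1)

print(rates[2005], shares[2005], round(avg_sample), round(avg_interviewed), avg_rate)
```

Under that assumption, the computed rates and shares agree with the percentages shown in the two tables for every year.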

When a household declines to participate in the CPS, it is called a “unit nonresponse.” If a household participates in the survey but declines to answer a certain question, it is called an “item nonresponse.” Survey participants sometimes decline to answer questions about sources and amounts of income. When that occurs, the Census Bureau imputes a response through statistical procedures that match the nonrespondent to a respondent with similar characteristics that are correlated with receipt of the specific type of income in question.
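The general idea of matching a nonrespondent to a similar respondent can be sketched as a simple cell-based donor imputation. This is an illustrative simplification, not the Census Bureau's actual procedure: the function name, the matching variables, and the records are all hypothetical.

```python
import random

def hot_deck_impute(records, cell_keys, target, seed=0):
    """Fill missing `target` values with a value drawn from a donor in the same cell.

    A cell is the combination of values of `cell_keys` (for example, age group
    and education). Records with no donor in their cell are left missing.
    """
    rng = random.Random(seed)

    # Collect donor values (reported answers) for each cell.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            cell = tuple(rec[k] for k in cell_keys)
            donors.setdefault(cell, []).append(rec[target])

    completed = []
    for rec in records:
        out = dict(rec, imputed=False)
        if rec[target] is None:
            pool = donors.get(tuple(rec[k] for k in cell_keys))
            if pool:
                out[target] = rng.choice(pool)
                out["imputed"] = True  # flag, like the imputation indicators in public-use files
        completed.append(out)
    return completed

# Hypothetical respondents; None marks an item nonresponse on earnings.
records = [
    {"age_group": "30-39", "education": "BA", "earnings": 60_000},
    {"age_group": "30-39", "education": "BA", "earnings": None},
    {"age_group": "18-29", "education": "HS", "earnings": 25_000},
    {"age_group": "18-29", "education": "HS", "earnings": 28_000},
    {"age_group": "18-29", "education": "HS", "earnings": None},
    {"age_group": "60-69", "education": "HS", "earnings": None},
]

completed = hot_deck_impute(records, ("age_group", "education"), "earnings")
for rec in completed:
    print(rec)
```

Because an imputed value is borrowed from another person, it can differ considerably from the amount in that worker's own administrative record, even when the donor is similar on the matching variables.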

CPS public-use files include variables that indicate whether the amount of income from a particular source on an individual's record was imputed. Chart 3 shows the number and percentage of survey respondents whose earnings amounts were imputed. The Census Bureau imputed earnings for 19.7 percent of participants who had any wage and salary income in the March 2005 CPS and for 17.7 percent in the March 2006 CPS. For the March 2021 and March 2022 surveys, the proportions were 21.4 percent and 21.9 percent, respectively. In recent years, the proportion of interviews for which the Census Bureau has imputed wage and salary income has been relatively stable at about 21 percent to 23 percent.

Table equivalent for Chart 3. Number of CPS respondents aged 18–69 with reported and imputed earnings and percentage with imputed earnings

| Year | Number with respondent-reported earnings | Number with Census-imputed earnings | Percentage imputed |
|---|---|---|---|
| 2005 | 77,698 | 19,055 | 19.7 |
| 2006 | 79,361 | 17,020 | 17.7 |
| 2010 | 77,505 | 17,480 | 18.4 |
| 2011 | 74,703 | 17,324 | 18.8 |
| 2015 | 68,445 | 20,872 | 23.4 |
| 2016 | 64,048 | 19,177 | 23.0 |
| 2019 | 63,358 | 18,397 | 22.5 |
| 2020 | 55,864 | 16,248 | 22.5 |
| 2021 | 57,593 | 15,659 | 21.4 |
| 2022 | 53,071 | 14,839 | 21.9 |

SOURCE: CPS/ASEC.

Comparison of CPS and DER Earnings Information

Chart 4 shows the percentage difference between CPS and DER earnings information for all workers aged 18–69 who had such information in both sources. In Panel A, the CPS component of the study sample includes all respondents regardless of whether their earnings information was self-reported or imputed by the Census Bureau. The median percentage difference falls into a narrow range from −0.6 percent to 0.0 percent and averages −0.3 percent. The 25th percentile difference between the CPS and DER amounts averages −16.1 percent and the 75th percentile difference averages 23.6 percent, for an average interquartile range of 39.7 percentage points. From 2005 through 2021, the interquartile range grew wider. In the March 2005 CPS, the 25th percentile difference between the CPS and DER data was −12.4 percent and the 75th percentile difference was 19.2 percent, an interquartile range of 31.6 percentage points. In March 2021, the 25th percentile difference was −18.3 percent and the 75th percentile difference was 25.7 percent, an interquartile range of 44.0 percentage points. Thus, the middle 50 percent of observations in March 2021 represented a wider range of difference between CPS and DER earnings data than in March 2005. Both underreporting and overreporting of earnings in the CPS increased by about 6 percentage points over this period.

Table equivalent for Chart 4. Percentage difference between CPS and DER earnings data at selected percentiles, by CPS earnings imputation status

| Year | 75th percentile | Median | 25th percentile |
|---|---|---|---|
| Panel A: Respondent-reported and imputed CPS earnings | | | |
| 2005 | 19.2 | -0.1 | -12.4 |
| 2006 | 21.4 | 0.0 | -13.8 |
| 2010 | 19.9 | -0.4 | -14.4 |
| 2011 | 21.2 | -0.3 | -15.2 |
| 2015 | 26.1 | -0.2 | -17.8 |
| 2016 | 25.9 | -0.6 | -18.4 |
| 2019 | 27.2 | 0.0 | -16.7 |
| 2020 | 25.6 | -0.2 | -17.6 |
| 2021 | 25.7 | -0.6 | -18.3 |
| Panel B: Respondent-reported CPS earnings | | | |
| 2005 | 14.1 | -0.2 | -10.4 |
| 2006 | 15.1 | -0.1 | -11.0 |
| 2010 | 14.1 | -0.4 | -11.5 |
| 2011 | 15.0 | -0.3 | -12.1 |
| 2015 | 17.3 | -0.3 | -13.6 |
| 2016 | 17.3 | -0.7 | -14.5 |
| 2019 | 19.4 | 0.0 | -13.6 |
| 2020 | 17.7 | -0.3 | -14.3 |
| 2021 | 17.5 | -0.8 | -15.5 |
| Panel C: Imputed CPS earnings | | | |
| 2005 | 84.0 | 6.0 | -36.6 |
| 2006 | 76.1 | 3.3 | -37.2 |
| 2010 | 74.6 | 2.5 | -38.0 |
| 2011 | 69.5 | 2.0 | -38.6 |
| 2015 | 75.7 | 2.5 | -38.8 |
| 2016 | 75.0 | 1.2 | -38.4 |
| 2019 | 80.6 | 4.6 | -37.2 |
| 2020 | 80.4 | 4.1 | -38.6 |
| 2021 | 87.5 | 5.2 | -40.1 |

SOURCE: Author's calculations based on CPS/ASEC and DER.

By Imputation Status

Panel B of Chart 4 shows the percentage difference between CPS and DER earnings data for workers aged 18–69 with CPS data restricted to respondent-reported earnings. The restricted sample represents an average of 81.4 percent of CPS respondents aged 18–69 with earnings data in both the CPS and DER.

The median percentage difference ranges from −0.8 percent to 0.0 percent and averages −0.4 percent. The 25th percentile difference between the CPS and DER earnings data averages −12.9 percent and the 75th percentile difference averages 16.4 percent, an average interquartile range of 29.3 percentage points. As in Panel A, the interquartile range of difference between CPS and DER earnings data increased over time, but throughout the period the interquartile range of difference between CPS and DER earnings data is narrower in Panel B than in Panel A. In Panel B, for the March 2005 CPS, the 25th percentile difference between CPS and DER earnings data was −10.4 percent and the 75th percentile difference was 14.1 percent, an interquartile range of 24.5 percentage points. In March 2021, the 25th percentile difference was −15.5 percent and the 75th percentile difference was 17.5 percent, an interquartile range of 33.0 percentage points.

Panel C of Chart 4 shows the percentage difference between CPS and DER earnings data for workers aged 18–69 with CPS data restricted to respondents whose earnings amounts were imputed by the Census Bureau. For the years shown in Panel C, this restricted sample represents an average of 18.6 percent of workers aged 18–69. In Panel C, the median difference between CPS and DER earnings data is greater, and the interquartile range is much larger, than in Panels A and B (note the differing vertical axis scales).

In Panel C, the median difference ranges from 1.2 percent to 6.0 percent and averages 3.5 percent. The interquartile range of difference is larger for workers with imputed CPS earnings than for those who self-reported their earnings. The 25th percentile difference between CPS and DER earnings data averages −38.2 percent and the 75th percentile difference averages 78.2 percent, for an average interquartile range of 116.4 percentage points. Moreover, although the interquartile range in Panel B is almost symmetrical, the interquartile range in Panel C is asymmetrical. In Panel B, the 25th percentile difference between CPS and DER earnings in March 2021 is −15.5 percent and the 75th percentile difference is 17.5 percent. Both differ from 0 percent by almost the same amount. In Panel C, the 25th percentile difference between CPS and DER earnings in March 2021 is −40.1 percent but the 75th percentile difference is 87.5 percent. When earnings data are imputed in the CPS, overestimates of earnings are larger than underestimates. Why imputed CPS earnings differ from reported earnings in this asymmetrical way is a potential area for future research.
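The average percentile differences and interquartile ranges quoted for the three panels can be reproduced from the Chart 4 table. The sketch below assumes that the average interquartile range is the gap between the rounded average percentiles; that ordering is an inference from the published figures, not a documented method, and the names are illustrative.

```python
# 75th and 25th percentile differences (percent) from the Chart 4 table.
# Years, in order: 2005, 2006, 2010, 2011, 2015, 2016, 2019, 2020, 2021.
chart4 = {
    "A: reported and imputed": {
        "p75": [19.2, 21.4, 19.9, 21.2, 26.1, 25.9, 27.2, 25.6, 25.7],
        "p25": [-12.4, -13.8, -14.4, -15.2, -17.8, -18.4, -16.7, -17.6, -18.3],
    },
    "B: reported only": {
        "p75": [14.1, 15.1, 14.1, 15.0, 17.3, 17.3, 19.4, 17.7, 17.5],
        "p25": [-10.4, -11.0, -11.5, -12.1, -13.6, -14.5, -13.6, -14.3, -15.5],
    },
    "C: imputed only": {
        "p75": [84.0, 76.1, 74.6, 69.5, 75.7, 75.0, 80.6, 80.4, 87.5],
        "p25": [-36.6, -37.2, -38.0, -38.6, -38.8, -38.4, -37.2, -38.6, -40.1],
    },
}

def summarize(panel):
    """Average each percentile across years, then take the gap between the rounded averages."""
    avg_p75 = round(sum(panel["p75"]) / len(panel["p75"]), 1)
    avg_p25 = round(sum(panel["p25"]) / len(panel["p25"]), 1)
    return {"avg_p25": avg_p25, "avg_p75": avg_p75,
            "avg_iqr": round(avg_p75 - avg_p25, 1)}

summary = {name: summarize(panel) for name, panel in chart4.items()}

# Asymmetry check for March 2021: 75th percentile plus 25th percentile.
# A value near zero means the two quartiles sit about equally far from zero.
skew_2021 = {name: round(panel["p75"][-1] + panel["p25"][-1], 1)
             for name, panel in chart4.items()}

for name in chart4:
    print(name, summary[name], skew_2021[name])
```

For March 2021, the sum of the two quartile differences is 2.0 percentage points in Panel B but 47.4 percentage points in Panel C, which is one way to quantify the asymmetry described above.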

By DER Earnings Quartile

Chart 5 shows the difference between CPS and DER earnings data among workers aged 18–69 in each of the four DER earnings quartiles. For this chart, and all charts that follow, the CPS component of the study sample includes all respondents, regardless of whether their earnings information was self-reported or imputed by the Census Bureau.

Table equivalent for Chart 5. Percentage difference between CPS and DER earnings data at selected percentiles, by DER earnings quartile

| Year | 75th percentile | Median | 25th percentile |
|---|---|---|---|
| Panel A: Fourth DER earnings quartile | | | |
| 2005 | 3.0 | -3.5 | -16.8 |
| 2006 | 3.1 | -4.1 | -21.3 |
| 2010 | 2.6 | -4.8 | -22.7 |
| 2011 | 2.4 | -5.5 | -25.0 |
| 2015 | 2.3 | -7.2 | -32.5 |
| 2016 | 1.9 | -7.8 | -32.5 |
| 2019 | 3.2 | -6.8 | -29.3 |
| 2020 | 2.6 | -7.7 | -31.7 |
| 2021 | 1.9 | -7.9 | -32.1 |
| Panel B: Third DER earnings quartile | | | |
| 2005 | 7.7 | -1.1 | -11.5 |
| 2006 | 8.9 | -1.0 | -13.1 |
| 2010 | 8.5 | -1.2 | -13.1 |
| 2011 | 9.0 | -1.0 | -13.1 |
| 2015 | 10.9 | -1.4 | -16.1 |
| 2016 | 10.3 | -1.8 | -17.2 |
| 2019 | 11.0 | -1.3 | -15.9 |
| 2020 | 9.9 | -1.4 | -15.9 |
| 2021 | 8.7 | -2.3 | -17.0 |
| Panel C: Second DER earnings quartile | | | |
| 2005 | 23.9 | 0.6 | -9.9 |
| 2006 | 24.7 | 1.3 | -11.1 |
| 2010 | 25.7 | 0.9 | -10.8 |
| 2011 | 27.9 | 1.3 | -11.3 |
| 2015 | 31.8 | 2.1 | -12.7 |
| 2016 | 31.3 | 1.6 | -13.2 |
| 2019 | 29.2 | 2.2 | -12.5 |
| 2020 | 27.3 | 1.2 | -13.7 |
| 2021 | 27.1 | 0.9 | -14.5 |
| Panel D: First DER earnings quartile | | | |
| 2005 | 135.8 | 15.5 | -9.7 |
| 2006 | 145.4 | 20.2 | -8.9 |
| 2010 | 130.8 | 14.0 | -10.4 |
| 2011 | 133.7 | 16.9 | -10.6 |
| 2015 | 164.3 | 26.7 | -9.3 |
| 2016 | 162.2 | 26.7 | -9.8 |
| 2019 | 162.8 | 32.3 | -6.6 |
| 2020 | 162.5 | 30.4 | -7.9 |
| 2021 | 184.2 | 33.2 | -7.8 |

SOURCE: Author's calculations based on CPS/ASEC and DER.

The difference between CPS and DER earnings data varies substantially across DER earnings quartiles. In general, workers in the top two quartiles of DER earnings underreport earnings and those in the bottom two quartiles overreport earnings. Workers in the fourth (highest) quartile underreport earnings by a larger percentage on average than those in the third quartile, and those in the first (lowest) quartile overreport earnings by a larger percentage on average than those in the second quartile.

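Panel A of Chart 5 shows the difference between CPS and DER earnings data among workers in the fourth quartile of DER earnings. In the years shown, the median difference ranges from −7.9 percent to −3.5 percent.

Panel B of Chart 5 shows the difference between CPS and DER earnings data among workers in the third quartile of DER earnings. The median difference ranges from −2.3 percent to −1.0 percent.

Panel C of Chart 5 shows the difference between CPS and DER earnings data among workers in the second quartile of DER earnings. The median difference ranges from 0.6 percent to 2.2 percent.

Panel D of Chart 5 shows the difference between CPS and DER earnings data among workers in the first quartile of DER earnings. The median difference ranges from 14.0 percent to 33.2 percent.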

Several factors could contribute to the large percentage difference between the CPS and DER data for earners in the first quartile. First, because earnings are relatively low among these workers, a relatively small dollar difference between CPS and DER data can appear large when expressed as a percentage. Second, earnings are imputed for a larger proportion of workers in the lowest quartile. CPS earnings data were imputed for an average of 17.2 percent of workers in the fourth quartile, 17.4 percent in the third quartile, 19.2 percent in the second quartile, and 20.6 percent in the first quartile (not shown). Third, low earners could overreport earnings because of the social stigma associated with low earnings. For example, workers who experience a year of below-average earnings might report a higher amount if they believe it is more representative of their typical annual earnings. Fourth, lower-earning CPS respondents could report cash earnings not captured on Form W-2, from which earnings data in the DER are derived. Finally, low-earning workers might differ from high-earning workers systematically in other unidentified characteristics.
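The first factor is a matter of arithmetic and can be illustrated with hypothetical amounts. In the sketch below, the same dollar gap between CPS and DER earnings is applied to a low earner and a high earner; the dollar values are invented for illustration.

```python
def pct_diff(cps, der):
    """Percentage difference between CPS and DER earnings: (CPS - DER) / DER * 100."""
    return (cps - der) / der * 100.0

gap = 2_000  # hypothetical dollar amount by which CPS earnings exceed DER earnings

low_earner = pct_diff(8_000 + gap, 8_000)      # DER earnings of $8,000
high_earner = pct_diff(80_000 + gap, 80_000)   # DER earnings of $80,000

print(low_earner, high_earner)
```

A $2,000 gap is a 25 percent difference for the worker with $8,000 in DER earnings but a 2.5 percent difference for the worker with $80,000.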

By Age

Chart 6 shows the percentage differences between CPS and DER earnings data for workers in five age groups: 18–29, 30–39, 40–49, 50–59, and 60–69.

Table equivalent for Chart 6. Percentage difference between CPS and DER earnings data at selected percentiles, by worker age

| Year | 75th percentile | Median | 25th percentile |
|---|---|---|---|
| Panel A: Aged 18–29 | | | |
| 2005 | 38.6 | 1.7 | -13.1 |
| 2006 | 41.6 | 2.3 | -15.3 |
| 2010 | 40.6 | 1.3 | -13.9 |
| 2011 | 44.3 | 2.2 | -15.2 |
| 2015 | 53.2 | 2.6 | -17.6 |
| 2016 | 50.5 | 1.7 | -18.5 |
| 2019 | 50.7 | 3.2 | -17.4 |
| 2020 | 48.5 | 2.3 | -17.8 |
| 2021 | 53.3 | 1.9 | -17.8 |
| Panel B: Aged 30–39 | | | |
| 2005 | 19.3 | 0.0 | -10.5 |
| 2006 | 21.4 | 0.2 | -11.4 |
| 2010 | 18.4 | 0.0 | -11.9 |
| 2011 | 18.4 | 0.0 | -12.7 |
| 2015 | 25.4 | 0.4 | -14.1 |
| 2016 | 25.0 | 0.0 | -14.5 |
| 2019 | 25.2 | 0.9 | -13.5 |
| 2020 | 22.3 | 0.0 | -15.8 |
| 2021 | 21.9 | -0.3 | -16.2 |
| Panel C: Aged 40–49 | | | |
| 2005 | 12.5 | -1.0 | -12.3 |
| 2006 | 14.6 | -0.7 | -14.0 |
| 2010 | 14.6 | -0.9 | -14.5 |
| 2011 | 15.1 | -1.0 | -15.8 |
| 2015 | 18.0 | -1.2 | -18.6 |
| 2016 | 17.3 | -1.7 | -20.3 |
| 2019 | 19.2 | -0.7 | -17.6 |
| 2020 | 18.6 | -1.0 | -18.8 |
| 2021 | 16.9 | -1.7 | -18.7 |
| Panel D: Aged 50–59 | | | |
| 2005 | 10.2 | -1.4 | -13.6 |
| 2006 | 12.8 | -1.0 | -14.3 |
| 2010 | 11.9 | -1.7 | -15.9 |
| 2011 | 13.5 | -1.4 | -16.7 |
| 2015 | 15.9 | -1.9 | -20.2 |
| 2016 | 16.0 | -1.9 | -20.1 |
| 2019 | 18.2 | -1.3 | -18.5 |
| 2020 | 18.9 | -1.3 | -19.2 |
| 2021 | 18.1 | -2.0 | -20.8 |
| Panel E: Aged 60–69 | | | |
| 2005 | 17.3 | -0.4 | -13.7 |
| 2006 | 16.6 | -0.7 | -14.3 |
| 2010 | 14.6 | -1.2 | -16.6 |
| 2011 | 17.2 | -0.9 | -15.5 |
| 2015 | 17.2 | -0.9 | -18.5 |
| 2016 | 19.6 | -1.2 | -18.8 |
| 2019 | 20.4 | -0.7 | -17.5 |
| 2020 | 20.4 | -0.4 | -17.0 |
| 2021 | 20.4 | -1.5 | -18.5 |

SOURCE: Author's calculations based on CPS/ASEC and DER.

By Sex

Chart 7 shows the percentage difference between CPS and DER earnings data for men and women. Among men aged 18–69, the median percentage difference between CPS and DER earnings data ranges from −0.5 percent to 0.0 percent and averages −0.2 percent (Panel A). Among women aged 18–69, the median percentage difference between CPS and DER earnings data ranges from −0.8 percent to 0.0 percent and averages −0.4 percent (Panel B).

Table equivalent for Chart 7. Percentage difference between CPS and DER earnings data at selected percentiles, by sex

| Year | 75th percentile | Median | 25th percentile |
|---|---|---|---|
| Panel A: Men | | | |
| 2005 | 21.4 | 0.0 | -12.1 |
| 2006 | 23.3 | 0.0 | -13.9 |
| 2010 | 22.6 | 0.0 | -14.5 |
| 2011 | 22.4 | -0.2 | -15.4 |
| 2015 | 26.3 | -0.2 | -18.7 |
| 2016 | 26.7 | -0.5 | -19.2 |
| 2019 | 26.1 | 0.0 | -17.2 |
| 2020 | 24.2 | -0.2 | -18.3 |
| 2021 | 26.0 | -0.5 | -18.8 |
| Panel B: Women | | | |
| 2005 | 17.0 | -0.4 | -12.7 |
| 2006 | 19.4 | -0.2 | -13.7 |
| 2010 | 17.3 | -0.8 | -14.3 |
| 2011 | 20.3 | -0.4 | -15.0 |
| 2015 | 25.8 | -0.3 | -17.0 |
| 2016 | 25.2 | -0.6 | -17.7 |
| 2019 | 28.1 | 0.0 | -16.2 |
| 2020 | 26.6 | -0.1 | -17.0 |
| 2021 | 25.4 | -0.7 | -18.0 |

SOURCE: Author's calculations based on CPS/ASEC and DER.

The interquartile ranges of difference were similar for men and women. Among men, the 25th percentile difference between the CPS and DER averages −16.4 percent and the 75th percentile difference averages 24.3 percent, an interquartile range of 40.7 percentage points. Among women, the 25th percentile difference averages −15.8 percent and the 75th percentile difference averages 22.8 percent, an interquartile range of 38.6 percentage points.

By Racial or Ethnic Group

Chart 8 shows the percentage difference between CPS and DER earnings data for workers in four racial or ethnic groups: non-Hispanic White, non-Hispanic Black, Hispanic, and Asian.10

Four panels of line charts with tabular version below.
Table equivalent for Chart 8. Percentage difference between CPS and DER earnings data at selected percentiles, by racial or ethnic group
Year 75th percentile Median 25th percentile
  Panel A: Non-Hispanic White
2005 17.2 0.0 -10.5
2006 19.5 0.0 -11.9
2010 18.6 -0.1 -12.5
2011 19.6 -0.1 -13.2
2015 22.8 0.0 -15.6
2016 22.3 -0.3 -16.3
2019 24.2 0.1 -14.2
2020 22.7 0.0 -15.2
2021 23.6 -0.2 -15.7
  Panel B: Non-Hispanic Black
2005 29.9 -1.1 -20.2
2006 32.6 0.0 -19.1
2010 28.4 -0.4 -18.6
2011 33.5 -0.1 -19.7
2015 41.3 0.4 -22.3
2016 44.4 -0.1 -23.0
2019 41.8 1.2 -22.1
2020 41.6 0.3 -23.1
2021 39.6 -0.7 -23.8
  Panel C: Hispanic
2005 25.4 -1.1 -18.1
2006 23.6 -1.4 -20.6
2010 20.2 -2.2 -20.3
2011 22.8 -1.5 -22.0
2015 30.3 -1.9 -23.5
2016 31.6 -1.7 -23.8
2019 30.0 -1.6 -22.4
2020 29.6 -1.2 -22.1
2021 27.8 -2.1 -23.9
  Panel D: Asian
2005 19.8 -1.9 -21.1
2006 20.4 -1.8 -20.1
2010 20.3 -1.6 -20.5
2011 19.0 -2.4 -20.5
2015 22.7 -2.7 -23.7
2016 25.5 -2.8 -24.9
2019 23.5 -1.8 -20.6
2020 22.3 -2.4 -21.9
2021 16.9 -3.9 -25.3
SOURCE: Author's calculations based on CPS/ASEC and DER.

By Education Level

Chart 9 shows the difference between CPS and DER earnings data for workers at four levels of educational attainment: did not finish high school, received a high school diploma, attended college but did not earn a 4-year degree, and received a bachelor's degree or higher.

Four panels of line charts with tabular version below.
Table equivalent for Chart 9. Percentage difference between CPS and DER earnings data at selected percentiles, by education level
Year 75th percentile Median 25th percentile
  Panel A: Less than high school diploma
2005 28.4 -1.5 -22.0
2006 30.0 -1.9 -24.4
2010 24.4 -2.0 -23.6
2011 24.1 -2.3 -24.7
2015 32.2 -2.4 -27.9
2016 32.5 -3.1 -29.7
2019 32.4 -3.5 -28.9
2020 34.0 -2.7 -27.7
2021 33.9 -3.2 -29.9
  Panel B: High school diploma
2005 19.6 -0.3 -13.7
2006 23.1 -0.1 -14.9
2010 22.4 -0.3 -15.8
2011 24.4 -0.3 -16.7
2015 31.2 -0.4 -20.6
2016 30.4 -0.4 -20.3
2019 31.1 0.0 -18.8
2020 31.9 0.0 -20.8
2021 34.0 -0.2 -21.1
  Panel C: Some college
2005 21.1 0.0 -11.3
2006 22.0 0.0 -12.4
2010 21.6 -0.2 -13.3
2011 24.2 0.0 -14.0
2015 28.7 0.0 -16.9
2016 29.9 -0.2 -17.8
2019 32.1 0.5 -15.8
2020 28.7 0.1 -16.3
2021 29.4 -0.3 -17.8
  Panel D: Bachelor's degree or higher
2005 14.6 -0.1 -10.5
2006 17.1 0.0 -11.7
2010 15.7 -0.3 -12.6
2011 15.8 -0.5 -13.5
2015 20.3 -0.1 -15.3
2016 19.4 -0.7 -16.2
2019 21.0 0.0 -14.6
2020 19.3 -0.4 -16.1
2021 17.9 -0.9 -16.3
SOURCE: Author's calculations based on CPS/ASEC and DER.

By Type of Hours Worked

Chart 10 shows the percentage difference between CPS and DER earnings data for workers who were employed full time and year-round and those who worked part-year or part time.

Two panels of line charts with tabular version below.
Table equivalent for Chart 10. Percentage difference between CPS and DER earnings data at selected percentiles, by type of hours worked
Year 75th percentile Median 25th percentile
  Panel A: Year-round and full time
2005 15.1 -0.1 -10.4
2006 16.6 0.0 -11.8
2010 15.9 -0.3 -11.8
2011 16.6 -0.2 -12.6
2015 21.3 -0.2 -15.2
2016 21.8 -0.5 -15.5
2019 22.8 0.0 -14.6
2020 21.2 -0.1 -15.5
2021 18.5 -0.8 -15.7
  Panel B: Part-year or part time
2005 32.5 -0.1 -19.7
2006 37.1 0.0 -21.5
2010 31.9 -0.6 -22.5
2011 35.7 -0.5 -23.8
2015 42.3 -0.4 -27.3
2016 41.1 -0.8 -29.2
2019 45.0 0.0 -25.7
2020 42.7 -0.2 -26.1
2021 48.2 0.0 -28.4
SOURCE: Author's calculations based on CPS/ASEC and DER.

Multivariate Analysis

Charts 4 through 10 illustrate the differences between CPS and DER earnings data by imputation status, DER earnings quartile, age, sex, racial or ethnic group, education level, and type of hours worked. Observing these differences individually is instructive, but it does not tell us how each variable is associated with the difference between CPS and DER earnings data when the other variables are also considered. Regression models can test multiple variables simultaneously. Table 1 shows the results of several OLS regressions that test the relationship between the difference in CPS and DER earnings data and each of the variables listed above, plus an additional variable that indicates whether a worker paid all or part of the premium for an employer-provided health insurance plan. Employees pay premiums for employer-provided health insurance with pre-tax earnings, and this amount is not included in the DER. Consequently, CPS respondents' earnings amounts may exceed the amounts in the DER if they pay for employment-based health insurance with pre-tax earnings.

Table 1. OLS regressions for differences in CPS and DER earnings data: CPS respondents aged 18–69 with wage data in the CPS and DER
Characteristic CPS year
Average 2006 2011 2016 2019 2021
Number of respondents 70,153 80,937 76,573 67,984 65,983 59,289
Dependent mean 0.0604 0.0710 0.0569 0.0493 0.0690 0.0560
Adjusted R-squared . . . 0.173 0.174 0.194 0.185 0.195
  Weighted sample means
CPS earnings imputed 0.1913 0.1750 0.1792 0.2177 0.1983 0.1863
Age 41.1494 40.2166 41.1939 41.3989 41.3975 41.5400
Men 0.5096 0.5112 0.5073 0.5106 0.5092 0.5098
Race or ethnicity
Non-Hispanic White 0.6770 0.7231 0.7083 0.6647 0.6502 0.6386
Non-Hispanic Black 0.1163 0.1120 0.1096 0.1172 0.1209 0.1219
Hispanic 0.1340 0.1073 0.1181 0.1423 0.1483 0.1538
Asian 0.0492 0.0384 0.0435 0.0512 0.0552 0.0575
Other or mixed a 0.0236 0.0193 0.0205 0.0246 0.0254 0.0282
Education level
Less than high school diploma 0.0634 0.0863 0.0666 0.0623 0.0542 0.0478
High school diploma 0.2667 0.2968 0.2758 0.2591 0.2538 0.2479
Some college 0.3054 0.3152 0.3171 0.3117 0.2969 0.2862
Bachelor's degree or higher 0.3645 0.3017 0.3405 0.3669 0.3951 0.4182
Employed year-round and full time 0.7048 0.7073 0.6827 0.7134 0.7404 0.6802
Pre-tax health insurance 0.4652 0.4639 0.4600 0.4645 0.4689 0.4688
  Parameter estimates
Intercept -1.5336 -1.5386*** -1.5291*** -1.5700*** -1.5477*** -1.4829***
CPS earnings imputed 0.0078 0.0065 -0.0120* -0.0166** 0.0347*** 0.0264***
DER earnings quartile
Third 0.2582 0.2217*** 0.2543*** 0.2905*** 0.2545*** 0.2698***
Second 0.4981 0.4439*** 0.5022*** 0.5514*** 0.4795*** 0.5135***
First 1.2237 1.1475*** 1.1791*** 1.3038*** 1.2187*** 1.2695***
Age 0.0285 0.0334*** 0.0297*** 0.0248*** 0.0291*** 0.0255***
Age squared . . . -0.0004*** -0.0003*** -0.0003*** -0.0003*** -0.0003***
Men 0.1188 0.1306*** 0.1161*** 0.1179*** 0.1134*** 0.1163***
Race or ethnicity
Non-Hispanic Black -0.0516 -0.0411*** -0.0445*** -0.0484*** -0.0629*** -0.0610***
Hispanic -0.0552 -0.0542*** -0.0573*** -0.0356*** -0.0638*** -0.0651***
Asian -0.0350 -0.0226* -0.0296** -0.0313* -0.0223** -0.0692***
Other or mixed a -0.0301 -0.0354** -0.0053 0.0029 -0.0587*** -0.0543***
Education level
Less than high school diploma -0.1233 -0.1049*** -0.1076*** -0.1526*** -0.1340*** -0.1173***
Some college 0.0635 0.0667*** 0.0669*** 0.0563*** 0.0608*** 0.0667***
Bachelor's degree or higher 0.2192 0.2013*** 0.2022*** 0.2426*** 0.2205*** 0.2297***
Employed year-round and full time 0.4658 0.4343*** 0.4679*** 0.5047*** 0.4634*** 0.4587***
Pre-tax health insurance 0.0606 0.0608*** 0.0739*** 0.0887*** 0.0423*** 0.0371***
SOURCE: Author's calculations based on CPS/ASEC and DER.
NOTES: Dependent variable: log(CPS earnings) − log(DER earnings)
. . . = not applicable.
* = statistically significant at the 0.10 level; ** = statistically significant at the 0.05 level; *** = statistically significant at the 0.01 level.
a. Consists primarily of respondents identifying as multiracial or American Indian/Alaska Native.
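
The Average column in Table 1 can be checked against the yearly estimates. The sketch below assumes that column is a simple unweighted mean of the five yearly values; the table does not state this, but the assumption reproduces the published averages for the three rows shown here.

```python
from statistics import mean

# Selected rows of Table 1, in CPS-year order 2006, 2011, 2016, 2019, 2021.
table1_rows = {
    "Dependent mean": [0.0710, 0.0569, 0.0493, 0.0690, 0.0560],
    "CPS earnings imputed": [0.0065, -0.0120, -0.0166, 0.0347, 0.0264],
    "Pre-tax health insurance": [0.0608, 0.0739, 0.0887, 0.0423, 0.0371],
}

# Assumption: the Average column is the unweighted mean of the yearly values.
averages = {name: round(mean(values), 4) for name, values in table1_rows.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.4f}")
```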

Following the research of Kim and Tamborini (2012, 2014), the dependent variable in Table 1 is the difference between each person's earnings amounts in the CPS and the DER. CPS and DER earnings amounts have been logarithmically transformed. If we assume that the DER represents actual earnings, then log(CPS earnings) − log(DER earnings) estimates the measurement error in the CPS.
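
As a minimal sketch, the dependent variable can be computed for one worker as shown below. The function name and the earnings amounts are hypothetical, and natural logarithms are assumed because the text does not state the base.

```python
import math


def log_earnings_gap(cps_earnings, der_earnings):
    """Return log(CPS earnings) - log(DER earnings) for one worker.

    Positive values mean the CPS amount exceeds the DER amount;
    negative values mean the CPS amount falls short of it.
    Assumption: natural logarithms.
    """
    if cps_earnings <= 0 or der_earnings <= 0:
        raise ValueError("both earnings amounts must be positive")
    return math.log(cps_earnings) - math.log(der_earnings)


# Hypothetical worker: $52,000 in the CPS, $50,000 in the DER.
gap = log_earnings_gap(52000, 50000)
print(f"log difference: {gap:.4f}")
```

Because the logarithm is defined only for positive amounts, a calculation like this applies only to workers with earnings in both sources, which matches the sample restriction used for Table 1.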

The OLS regressions use data from the 2006, 2011, 2016, 2019, and 2021 CPS/ASEC surveys linked to the DER. Each year's sample consists of workers aged 18–69 with earnings data in both the CPS and the DER, regardless of whether the CPS earnings were reported or imputed. The unweighted sample sizes range from 59,289 to 80,937 respondents with an average of 70,153 respondents per year. The mean of the dependent variable, log(CPS earnings) − log(DER earnings), for those 5 years ranges from 0.0493 to 0.0710 with an average of 0.0604. Thus, annual earnings amounts in the CPS exceed those in the DER by an average of 6 percent for a typical respondent in the 5 years tested in the OLS model. In the regression results, a positive coefficient indicates association with a larger percentage difference between the CPS and DER earnings data than the mean difference while a negative coefficient indicates a smaller percentage difference than the mean.
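
A difference in logarithms approximates a percentage difference when the values are small. The sketch below converts the dependent means from Table 1 into exact percentage differences in the ratio of CPS to DER earnings. It assumes natural logarithms, and the function name is illustrative.

```python
import math


def log_points_to_percent(log_diff):
    """Convert a log difference into a percentage difference in the ratio.

    Uses 100 * (exp(x) - 1); assumes x is a natural-log difference.
    """
    return 100.0 * math.expm1(log_diff)


# Dependent means from Table 1, by CPS year.
dependent_means = {2006: 0.0710, 2011: 0.0569, 2016: 0.0493, 2019: 0.0690, 2021: 0.0560}

for year, value in dependent_means.items():
    print(f"{year}: {log_points_to_percent(value):.1f} percent")

# Five-year average dependent mean from Table 1.
average_percent = log_points_to_percent(0.0604)
print(f"Average: {average_percent:.1f} percent")
```

For the five-year average of 0.0604, the conversion gives roughly 6.2 percent, which is consistent with the rounded 6 percent figure cited above.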

The difference between CPS and DER earnings data varies by whether CPS earnings were self-reported or imputed and by the quartile rank of workers' DER earnings. In the regression model, a dummy variable indicates whether the earnings values were imputed. If so, the variable has a value of 1; if earnings were self-reported, its value is 0. Three variables indicate whether a worker's DER earnings ranked in the third, second, or first quartile in the calendar year preceding the survey. The fourth (highest) earnings quartile is the omitted category.
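
The coding of these indicator variables can be illustrated as follows. This is a sketch under stated assumptions: the function and field names are hypothetical, and quartiles are numbered 1 (lowest) through 4 (highest).

```python
def design_row(earnings_imputed, der_quartile):
    """Build the imputation and quartile indicators for one worker.

    earnings_imputed: True if the Census Bureau imputed CPS earnings.
    der_quartile: 1 (lowest) through 4 (highest) DER earnings quartile.
    The fourth quartile is the omitted reference category, so a worker
    in that quartile has all three quartile indicators set to 0.
    """
    if der_quartile not in (1, 2, 3, 4):
        raise ValueError("der_quartile must be 1, 2, 3, or 4")
    return {
        "imputed": 1 if earnings_imputed else 0,
        "quartile_third": 1 if der_quartile == 3 else 0,
        "quartile_second": 1 if der_quartile == 2 else 0,
        "quartile_first": 1 if der_quartile == 1 else 0,
    }


# Hypothetical workers.
print(design_row(earnings_imputed=True, der_quartile=4))
print(design_row(earnings_imputed=False, der_quartile=1))
```

With this coding, each quartile coefficient measures the difference from workers in the fourth quartile, holding the other variables constant.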

In the 5 years tested in the model, the weighted proportion of workers aged 18–69 with earnings data in both the CPS and DER for whom the Census Bureau imputed wage and salary income ranged from 17.5 percent to 21.8 percent. The coefficient for the variable indicating that the Census Bureau imputed earnings is negative for the March 2011 and March 2016 CPS and positive for the March 2019 and March 2021 CPS. (It was positive but not statistically significant for the March 2006 CPS.) One reason the sign changed may be that the Census Bureau changed its data processing and imputation procedures during this period. CPS technical documentation notes that the “imputation system was updated to make use of income ranges provided by some non-respondents as well as to increase the number of characteristics used in the imputation models” (Census Bureau 2019). Although the imputation variable was statistically significant, the coefficient was not large. In the regression on the March 2021 CPS, the mean of log(CPS earnings) − log(DER earnings) was 0.0560 and the coefficient for the imputation variable was 0.0264. All else being equal, log(CPS earnings) − log(DER earnings) was 0.0264 greater, meaning the ratio of CPS to DER earnings was roughly 2.7 percent higher, when earnings were imputed rather than reported.

Chart 5 shows the median differences between CPS and DER earnings data for workers in each DER earnings quartile. For the 9 years in the study period, the median differences average −6.1 percent in the fourth quartile, −1.4 percent in the third quartile, 1.3 percent in the second quartile, and 24.0 percent in the first quartile.11 In general, workers in the fourth quartile of DER earnings tend to underreport earnings in the CPS and those in the first quartile tend to overreport their earnings. In the OLS regression, the fourth quartile is omitted as the reference group. The coefficients for the three lower quartiles are positive, are statistically significant, and increase as quartile rank falls. Other things being equal, log(CPS earnings) − log(DER earnings) is negatively correlated with earnings as measured by DER earnings quartile rank.

Results for the other independent variables generally reflect the relationships seen in the charts. The coefficients for men and for year-round, full-time workers are positive and statistically significant. The coefficient for age is positive and the coefficient for age squared is negative; both are significant. Relative to non-Hispanic White workers, the coefficients for non-Hispanic Black, Hispanic, and Asian workers are negative and significant. Relative to high school graduates, the coefficient for workers without a high school diploma is negative and significant and the coefficients for workers with some college and college graduates are positive and significant.

The variable indicating that a worker pays all or part of the premium for employer-provided health insurance is statistically significant in each survey year examined and has an average coefficient of 0.06, indicating that it is associated with a slightly higher-than-average percentage difference between CPS and DER data on wages, all else being equal. In a regression run separately on workers in each quartile of DER earnings (not shown), this variable was statistically significant only for workers in the third and second quartiles. Most workers in the lowest quartile of earnings do not have employer-provided health insurance unless it is through a family member's employer. For workers in the highest quartile, health insurance premiums often represent a smaller percentage of earnings than they do for workers in the middle two quartiles.
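
A worked example may help show why pre-tax premiums raise the measured difference. The amounts below are hypothetical, and the sketch assumes the respondent reports gross wages accurately and that the premium is the only source of difference between the two records.

```python
import math

# Hypothetical amounts, for illustration only.
gross_wages = 50_000.0      # wages before deductions, as reported in the CPS
pretax_premium = 3_000.0    # employee share of the health insurance premium

# Assumption: the DER amount excludes the pre-tax premium and nothing else.
der_wages = gross_wages - pretax_premium

# Dependent variable from Table 1 for this hypothetical worker.
gap = math.log(gross_wages) - math.log(der_wages)
print(f"log(CPS earnings) - log(DER earnings) = {gap:.4f}")
```

The same premium produces a smaller log difference at higher earnings, which is in line with the observation above that premiums represent a smaller share of earnings for workers in the highest quartile.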

Given that the coefficient of determination (adjusted R-squared) across the 5 CPS years ranges from 0.173 to 0.195, the regression model explains less than 20 percent of the variability observed in the percentage difference between CPS and DER earnings data (Table 1). Other factors not included in the model also affect the difference between earnings reported in the CPS and amounts recorded in the DER. Whether other worker characteristics that we can observe in the CPS or in other data sets can explain more of the difference between CPS and DER earnings data is a possible topic for further research.
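
For reference, the adjusted R-squared penalizes the ordinary R-squared for the number of regressors. The sketch below uses the standard textbook formula. The unadjusted R-squared shown is hypothetical because Table 1 reports only the adjusted value; the sample size is the 2021 figure from Table 1, and the count of 16 regressors is taken from the parameter estimates listed there (excluding the intercept).

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Standard adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    dof = n_obs - n_regressors - 1
    if dof <= 0:
        raise ValueError("need more observations than regressors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / dof


# Hypothetical unadjusted R-squared; n is the 2021 sample size in Table 1.
example = adjusted_r_squared(r_squared=0.1952, n_obs=59289, n_regressors=16)
print(f"adjusted R-squared: {example:.3f}")
```

With samples this large, the adjustment is very small, so the adjusted and unadjusted values would be nearly identical.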

Summary and Conclusion

This note has examined earnings data in the CPS and earnings amounts recorded for the same workers in the DER administrative data file. The results generally confirm the findings of earlier research that compared earnings data from household surveys with those in SSA's records. If we assume that SSA's records represent actual earnings, the results suggest that the misreporting of earnings in the CPS is not random. Misreporting in the CPS varies by imputation status, DER earnings quartile, age, sex, racial or ethnic group, education level, and type of hours worked.

Both underreporting and overreporting of earnings in the CPS increased over the period studied. From 2005 through 2021, the interquartile range of differences in earnings data between the two sources, representing the middle 50 percent of observations, grew wider. The 25th percentile difference between the CPS and DER moved from −12.4 percent to −18.3 percent and the 75th percentile difference rose from 19.2 percent to 25.7 percent. Thus, the interquartile range widened from 31.6 percentage points to 44.0 percentage points (Chart 4, Panel A).

The difference between CPS and DER earnings data varied substantially by DER earnings quartile. In general, workers in the highest quartile of DER earnings underreported earnings on the CPS, and those in the lowest quartile of DER earnings overreported earnings. Among workers in the fourth quartile of DER earnings, the median percentage difference between the CPS and DER averaged −6.1 percent (Chart 5, Panel A). Among workers in the first quartile of DER earnings, the median percentage difference between the CPS and DER averaged 24.0 percent (Chart 5, Panel D). Because higher-earning workers underreported earnings and lower-earning workers overreported earnings, researchers should be cautious about using CPS/ASEC public-use files to study the distribution of earnings and earnings inequality. Ideally, such research should use CPS files linked to SSA earnings records.

Appendix

Table A-1. Characteristics of CPS respondents aged 18–69 with wage data in the CPS and DER, unweighted
Characteristic CPS year
2005 2006 2010 2011 2015 2016 2019 2020 2021
Wage data in—
Both DER and CPS 58,604 80,937 78,356 76,573 73,066 67,984 65,983 57,930 59,289
DER only 4,595 6,913 7,195 7,545 7,481 6,867 6,577 5,710 6,265
CPS only 4,259 5,732 6,011 5,867 5,949 5,478 5,248 4,595 4,567
CPS wages imputed 8,475 13,165 13,120 13,242 16,139 14,787 13,029 11,634 11,228
CPS wages imputed (%) 14.5 16.3 16.7 17.3 22.1 21.8 19.7 20.1 18.9
DER earnings quartile
Fourth 14,451 19,995 19,227 19,307 18,304 16,799 16,279 14,443 14,666
Third 14,886 20,403 19,705 19,436 18,553 17,360 16,560 14,820 15,116
Second 14,743 20,340 19,883 19,100 18,412 17,145 16,771 14,540 14,906
First 14,524 20,199 19,541 18,730 17,797 16,680 16,373 14,127 14,601
Age
18–29 14,081 19,048 18,033 17,814 16,401 15,346 14,670 12,008 12,620
30–39 14,930 19,454 18,325 17,814 17,330 16,272 16,050 14,150 14,726
40–49 16,554 22,617 20,239 19,641 17,087 15,735 15,050 12,987 13,439
50–59 9,940 15,116 15,878 15,729 15,265 13,973 12,959 11,940 11,778
60–69 3,099 4,702 5,881 6,283 6,983 6,658 7,284 6,845 6,726
Sex
Men 29,460 40,735 39,183 38,202 36,712 34,288 33,223 29,244 30,004
Women 29,144 40,202 39,173 38,371 36,354 33,696 32,760 28,686 29,285
Race or ethnicity
Non-Hispanic White 41,910 56,931 53,650 51,969 47,286 43,322 41,917 36,974 37,061
Non-Hispanic Black 5,699 8,152 8,193 7,940 8,120 7,798 7,200 6,078 6,566
Hispanic 6,619 9,996 10,418 10,341 11,232 11,022 10,953 9,380 9,935
Asian 2,287 3,233 3,737 3,960 4,084 3,696 3,816 3,640 3,670
Other or mixed a 2,089 2,625 2,358 2,363 2,344 2,146 2,097 1,858 2,057
Education level
Less than high school diploma 5,559 7,354 5,903 5,566 5,002 4,638 3,943 3,147 3,087
High school diploma 17,545 24,032 22,068 21,017 19,333 17,697 16,891 14,320 14,994
Some college 18,631 25,490 24,758 24,034 22,700 21,142 19,709 17,063 16,967
Bachelor's degree or higher 16,869 24,061 25,627 25,956 26,031 24,507 25,440 23,400 24,241
Employment
Worked year-round and full time 40,125 57,048 53,131 52,348 52,308 48,856 49,050 43,209 40,694
Worked part-year or part time 18,479 23,889 25,225 24,225 20,758 19,128 16,933 14,721 18,595
SOURCE: Author's calculations based on CPS/ASEC and DER.
a. Consists primarily of respondents identifying as multiracial or American Indian/Alaska Native.
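
The "CPS wages imputed (%)" row in Table A-1 can be reproduced from the two count rows above it, as in the short sketch below. These are unweighted shares, so they differ somewhat from the weighted sample means reported in Table 1.

```python
# Counts from Table A-1, by CPS year.
both_sources = {
    2005: 58604, 2006: 80937, 2010: 78356, 2011: 76573, 2015: 73066,
    2016: 67984, 2019: 65983, 2020: 57930, 2021: 59289,
}
wages_imputed = {
    2005: 8475, 2006: 13165, 2010: 13120, 2011: 13242, 2015: 16139,
    2016: 14787, 2019: 13029, 2020: 11634, 2021: 11228,
}

# Unweighted share of linked respondents whose CPS wages were imputed.
imputed_share = {
    year: round(100.0 * wages_imputed[year] / both_sources[year], 1)
    for year in both_sources
}

for year, share in imputed_share.items():
    print(f"{year}: {share:.1f} percent")
```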

Notes

1 Survey respondents can opt out of the data linkage.

2 These restricted-use linked files are available only for research and analysis by individuals who have completed required training on the federal laws that protect the confidentiality of data provided by Census Bureau survey participants. Access to the linked files is available through secure computing facilities for research projects that have been approved by the Census Bureau.

3 Appendix Table A-1 shows the annual numbers of observations in the linked data sets with earnings data in the CPS but not in the DER and the numbers with earnings data in the DER but not in the CPS.

4 CPS respondents are asked the amount they earned before any deductions. Employees pay premiums for employer-provided health insurance with pre-tax earnings, and the premium is not included in the DER data. CPS earnings could be more than the amount in the DER for respondents with employer-provided health insurance.

5 Procedures the Census Bureau uses to prevent disclosure of confidential information may result in differences between the public-use CPS/ASEC and the DER. These procedures are unlikely to have affected the results of this analysis.

6 Not all W-2 forms are posted to the DER. Some are posted to the Earnings Suspense File because they fail to meet SSA match criteria. Some employers or payroll providers submit W-2 forms late, and some forms contain errors that require correction; however, a large majority of W-2 forms are submitted on time and with accurate information.

7 DER linkage was not yet available for the 2022 CPS public-use file.

8 The percentage difference between CPS earnings and DER earnings is the same whether CPS and DER earnings are both in current or constant dollars.

9 The Census Bureau assigns noninterview households a sample weight of zero and adjusts the sample weights of interview households accordingly.

10 In the race and ethnicity categories used by the Census Bureau, a person of Hispanic ethnicity may be of any race.

11 For the 5 years used in the OLS model, the median differences average −6.4 percent in the fourth quartile, −1.5 percent in the third quartile, 1.5 percent in the second quartile, and 25.9 percent in the first quartile.

References

Abowd, John M., and Martha H. Stinson. 2013. “Estimating Measurement Error in Annual Job Earnings: A Comparison of Survey and Administrative Data.” Review of Economics and Statistics 95(5): 1451–1467.

Bollinger, Christopher R. 1998. “Measurement Error in the Current Population Survey: A Nonparametric Look.” Journal of Labor Economics 16(3): 576–594.

Bollinger, Christopher R., and Barry T. Hirsch. 2006. “Match Bias from Earnings Imputation in the Current Population Survey: The Case of Imperfect Matching.” Journal of Labor Economics 24(3): 483–519.

Bollinger, Christopher R., Barry T. Hirsch, Charles M. Hokayem, and James P. Ziliak. 2019. “Trouble in the Tails? What We Know about Earnings Nonresponse 30 Years after Lillard, Smith, and Welch.” Journal of Political Economy 127(5): 2143–2185.

Bound, John, and Alan B. Krueger. 1991. “The Extent of Measurement Error in Longitudinal Earnings Data: Do Two Wrongs Make a Right?” Journal of Labor Economics 9(1): 1–24.

Census Bureau. 2019. Current Population Survey: March 2019 Annual Social and Economic Supplement (ASEC). Technical Documentation. https://www2.census.gov/programs-surveys/cps/techdocs/cpsmar19.pdf.

———. 2022. Current Population Survey: March 2022 Annual Social and Economic Supplement (ASEC). Technical Documentation. https://www2.census.gov/programs-surveys/cps/techdocs/cpsmar22.pdf.

Cristia, Julian, and Jonathan A. Schwabish. 2007. “Measurement Error in the SIPP: Evidence from Matched Administrative Records.” Working Paper No. 2007-03. Washington, DC: Congressional Budget Office. https://www.cbo.gov/sites/default/files/110th-congress-2007-2008/workingpaper/2007-03_0.pdf.

Davies, Paul S., and T. Lynn Fisher. 2009. “Measurement Issues Associated with Using Survey Data Matched with Administrative Data from the Social Security Administration.” Social Security Bulletin 69(2): 1–12. https://www.ssa.gov/policy/docs/ssb/v69n2/v69n2p1.html.

Genadek, Katie R., Charles Hokayem, and Philip Pendergast. 2021. “The Summary Earnings Record and Detailed Earnings Record Extracts.” Working Paper No. 2021-05. Washington, DC: Census Bureau. https://www.census.gov/library/working-papers/2021/econ/earnings-record-extracts.html.

Gottschalk, Peter, and Minh Huynh. 2005. “Validation Study of Earnings Data in the SIPP—Do Older Workers Have Larger Measurement Error?” Working Paper No. 2005-07. Chestnut Hill, MA: Center for Retirement Research at Boston College. https://crr.bc.edu/wp-content/uploads/2005/05/wp_2005-071.pdf.

Kim, ChangHwan, and Christopher R. Tamborini. 2012. “Do Survey Data Estimate Earnings Inequality Correctly? Measurement Errors Among Black and White Male Workers.” Social Forces 90(4): 1157–1181.

———. 2014. “Response Error in Earnings: An Analysis of the Survey of Income and Program Participation Matched with Administrative Data.” Sociological Methods & Research 43(1): 39–72.

Meyer, Bruce D., Wallace K. C. Mok, and James X. Sullivan. 2015. “Household Surveys in Crisis.” Journal of Economic Perspectives 29(4): 199–226. https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.29.4.199.

Olsen, Anya, and Russell Hudson. 2009. “Social Security Administration's Master Earnings File: Background Information.” Social Security Bulletin 69(3): 29–46. https://www.ssa.gov/policy/docs/ssb/v69n3/v69n3p29.html.

Pedace, Roberto, and Nancy Bates. 2000. “Using Administrative Records to Assess Earnings Reporting Error in the Survey of Income and Program Participation.” Journal of Economic and Social Measurement 26: 173–192.

Roemer, Marc. 2002. “Using Administrative Earnings Records to Assess Wage Data Quality in the March Current Population Survey and the Survey of Income and Program Participation.” Working Paper. Washington, DC: Census Bureau. https://www.census.gov/content/dam/Census/library/working-papers/2002/demo/asa2002.pdf.

Rothbaum, Jonathan, and Edward Berchick. 2019. “Redesign of the Current Population Survey Annual Social and Economic Supplement.” Presented at the Census Scientific Advisory Committee Spring 2019 Meeting, Washington, DC. https://www2.census.gov/cac/sac/meetings/2019-03/current-population-survey-annual-social-economic-supplement.pdf.