Social Security Administration's Master Earnings File: Background Information

by Anya Olsen and Russell Hudson
Social Security Bulletin, Vol. 69 No. 3, 2009

The Social Security Administration (SSA) receives reports of earnings for the U.S. working population each year. Earnings data are used to administer the Social Security programs and to conduct research on the populations served by those programs. The administrative needs of SSA and other agencies have changed over time and, as a result, there have been numerous changes to the main source of SSA's earnings data, which is known as the Master Earnings File (MEF). By documenting the history, content, limitations, complexities, and uses of the MEF (and data files derived from the MEF), this article serves as a resource for researchers who use earnings data to study work patterns and their implications. It is also a resource for policymakers and administrators who must understand the data used in administering current-law programs and the data available to inform potential changes to those programs.

Anya Olsen is with the Office of Retirement Policy, Office of Retirement and Disability Policy (ORDP), Social Security Administration (SSA). Russell Hudson is with the Office of Research, Evaluation, and Statistics, ORDP, SSA.

Acknowledgments: The authors thank Michael Compson, Susan Grad, Joyce Manchester, David Pattison, Carolyn Puckett, Dave Shoffner, Jae Song, Hilary Waldron and David Weaver for their helpful comments and suggestions. In addition, comments received from Phil Itzkowitz and his staff in the SSA Office of Systems and from Bert Kestenbaum, Jeff Kunkel, and Bill Piet in the SSA Office of the Chief Actuary were very useful.

The findings and conclusions presented in the Bulletin are those of the authors and do not necessarily represent the views of the Social Security Administration.


Selected Abbreviations
AWI average wage index
CWHS Continuous Work History Sample
EIN employer identification number
ESF Earnings Suspense File
FICA Federal Insurance Contributions Act
HI Hospital Insurance
HSA Health Savings Account
IRS Internal Revenue Service
MEF Master Earnings File
MQGE Medicare Qualified Government Employment
OASDI Old-Age, Survivors, and Disability Insurance
P.L. Public Law
QC quarter of coverage
SSA Social Security Administration
SSB Social Security Board
SSN Social Security number

Each year employers and the Internal Revenue Service (IRS) send information to the Social Security Administration (SSA) on the earnings of the U.S. working population. SSA uses this information to calculate benefit amounts for all types of beneficiaries, including retired workers, spouses, widow(er)s, children, and the disabled. SSA stores this earnings information as the Master Earnings File (MEF) and because it comprises IRS tax data, it is subject to IRS disclosure rules.1 This file contains data derived from IRS Form W-2, quarterly earnings records, and annual income tax forms. These data include regular wages and salaries, tips, self-employment income, and deferred compensation (contributions or distributions). In addition to calculating Social Security benefits, MEF data are used for policy analysis and research both within and outside SSA. This article is primarily for researchers interested in using data derived from the MEF to better understand the past and present U.S. working population.2 It is also of use to policymakers and administrators who must understand the underlying data used in administering current-law programs and the data available to inform potential changes to those programs. This article examines the history of the data, how the data are collected and entered into the SSA computer systems, the information contained in the data, some limitations and complexities of using the data for research purposes, and how the agency uses the data.

History of the Social Security Program

The original Social Security Act, which was enacted in 1935, required that monthly benefits be paid to qualified individuals aged 65 or older based on their wages from employment before age 65.3 The law tasked SSA's predecessor, the Social Security Board (SSB), with obtaining earnings information in order to calculate benefit amounts in retirement. In order to assign earnings to a specific individual, the SSB established Social Security numbers (SSNs) to allow employers to uniquely identify, and accurately report, earnings covered under the new program. This process began in November 1936 with the assistance of the Post Office Department (Corson 1938). Beginning in 1937, information on earnings up to the taxable maximum of $3,000 was collected for all qualified individuals. This was the maximum amount on which both employers and employees were required to pay their share of taxes (1.0 percent each) under Title VIII of the original Social Security Act. In the 1939 amendments, the taxing provisions were taken out of the Social Security Act and placed in the Internal Revenue Code as the Federal Insurance Contributions Act (FICA) (SSA 2009e).4 FICA taxes (also called payroll taxes) continue to be withheld from wages and earnings up to the taxable maximum, which has increased over the past 70 years. For 2009, Social Security taxes are collected on earnings up to $106,800.

Changes to Coverage

The Social Security Act stipulated who would be covered by the program, meaning those who would pay into the system while working and then receive benefits in retirement. The types and numbers of workers covered by Social Security have changed over time as more categories of workers have been added to the rolls (see Chart 1). Under the original act, all workers in commerce and industry (excluding railroads) were covered by the program.5 In 1940, 24 million workers were in covered employment, which was approximately 52 percent of the employed labor force (SSB 1944). Self-employment earnings information was first collected in 1951 when nonfarm self-employed workers (except members of professional groups) were added to the Social Security program. Additional groups of self-employed workers and professionals were added through legislation passed in 1954, 1956, and 1965 (more information appears in the Self-Employment Earnings section).

Chart 1.
Historical expansion of Social Security coverage: Additional types of workers covered, by date of authorizing legislation
SOURCE: SSA 2008, Table 2.A1.

Various types of agricultural and domestic workers and members of the uniformed services on active duty were also added during the 1950s and 1960s, bringing the number of workers with taxable earnings to 92.1 million by 1970 (SSA 2008). The 1983 amendments to the act added newly hired federal employees, members of Congress, the president and vice president, and newly hired employees of nonprofit organizations. Today, approximately 96 percent of the U.S. workforce (including workers in American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands) participate in the Social Security program (SSA 2008).6 As more workers are added to the program, SSA collects an increasing number of earnings records each year. The MEF currently collects earnings information on an annual basis for about 160 million people working in the United States and its territories.

Changes to the Taxable Maximum

In addition to changing coverage laws, changes to the Social Security program and Social Security-related tax laws have also affected the information contained in the MEF (see Chart 2). Since its inception, there have been increases to the maximum income subject to Social Security payroll taxes, which has resulted in higher earnings amounts being stored in the MEF. The first increase in the taxable maximum, from $3,000 to $3,600, occurred in 1951, and four additional increases occurred through 1971. The 1972 Social Security Amendments provided for annual increases in the taxable maximum, proportional to the increase in the national average wage, beginning in 1975.7 Since 1978, earnings information has also been collected for workers and earnings not covered by the program and for those with earnings above the taxable maximum (for more information on changes to the earnings data see the Relevant Time Periods section).

Chart 2.
Selected Social Security program changes affecting Master Earnings File information
SOURCE: SSA 2008, Table 2.A3; SSA 2009e; and Donkar 1981.
NOTES: Entries with effective dates given are shown by date of authorizing legislation.
SSA = Social Security Administration.

In this article "covered earnings" refers to those from employment covered by Social Security or, more specifically, Old-Age, Survivors, and Disability Insurance (OASDI). "Noncovered earnings" refers to those from employment not covered by OASDI. Covered earnings below the taxable maximum are called "OASDI taxable earnings," while those above the taxable maximum are referred to as "OASDI nontaxable earnings." A "quarter of coverage" (QC) is the basic unit for determining whether a worker is insured under the Social Security program. Covered workers must have a specific number of QCs to receive benefits, and the earnings needed to qualify for one QC has changed over time.8


Other major changes to the program, such as the creation of Medicare in 1965, required new information to be added to the MEF. Medicare originally contained two parts: Part A, or Hospital Insurance (HI), provided free of premiums and generally covering inpatient hospital care; and Part B, or Supplementary Medical Insurance (SMI), requiring beneficiaries to pay a monthly premium and covering certain medical services and supplies.9 Beginning in 1966, payroll taxes were collected for HI, generally from those who were also covered by the Social Security program (SSA 2008). Taxes were shared equally by the employer and the employee, and amounted to a combined 0.7 percent of wages. This amount has increased over the years to the current combined tax of 2.9 percent. Today, the combined OASDI and HI payroll tax rate is 15.3 percent (7.65 percent each for the employer and employee).

From 1966 through 1990, the HI payroll tax was collected on earnings up to the Social Security taxable maximum. Under Public Law (P.L.) 101-508, enacted in 1990, the taxable maximum for Medicare in 1991 was increased to $125,000 (the taxable maximum for Social Security that year was $53,400) and was to be indexed to average wages thereafter. However, P.L. 103-66 repealed the Medicare taxable maximum beginning in 1994, and required HI payroll taxes to be paid on all wages and self-employment earnings. This increased the amount of earnings reported to SSA for those who had earnings above the Social Security taxable maximum, and greatly increased the amount of self-employment earnings records in the MEF. All earnings that are subject to OASDI taxes are also subject to HI taxes; however, the reverse is not true. Earnings that are subject to payroll taxes for Medicare purposes, but are not subject to OASDI taxes, are referred to as Medicare-only or HI-taxable earnings in the MEF. In addition, HI- or Medicare-covered earnings are from employment in jobs covered by Medicare, but not OASDI. Since 1994, HI-taxable earnings in the MEF are equal to HI-covered earnings because there is no longer an HI taxable maximum.
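The relationship between OASDI-taxable and HI-taxable earnings described above can be illustrated with a short sketch. It assumes a single covered job, and the function name and return layout are hypothetical, chosen only for illustration; they are not MEF field names.

```python
def split_taxable_wages(wages, oasdi_max, hi_max=None):
    """Return (oasdi_taxable, hi_taxable, medicare_only) for one covered job.

    hi_max=None models 1994 and later, when no HI taxable maximum applies.
    Assumes hi_max, when given, is at least as large as oasdi_max.
    """
    oasdi_taxable = min(wages, oasdi_max)
    hi_taxable = wages if hi_max is None else min(wages, hi_max)
    # Earnings taxed for Medicare but not for OASDI.
    medicare_only = hi_taxable - oasdi_taxable
    return oasdi_taxable, hi_taxable, medicare_only


# 1991 limits quoted in the text: $53,400 (OASDI) and $125,000 (HI).
print(split_taxable_wages(150_000, 53_400, 125_000))
# No HI maximum; the OASDI maximum used here is illustrative only.
print(split_taxable_wages(150_000, 60_600))
```

For 1966 through 1990 the two maximums were equal, so the third value would be zero for a worker in OASDI-covered employment.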

Because the Medicare coverage rules are different from those for the OASDI program, the MEF contains information on earnings subject to the Medicare tax but not also to the OASDI tax. Theoretically, this should be the case only for workers with Medicare Qualified Government Employment (MQGE).10 This includes federal government employees hired before January 1, 1984, and state and local government employees hired after March 31, 1986, or whose employment after this date is subject to special conditions of the Social Security Act (CFR 2008).11 The wages paid to those under MQGE are classified in the MEF as HI-taxable earnings. These earnings are used for Medicare purposes and do not qualify the worker for OASDI benefits, as they are not OASDI-taxable.

Self-Employment Earnings

As noted earlier, nonfarm and nonprofessional self-employed workers were added to the program by the 1950 Social Security Amendments (self-employed farm and professional workers were added later) (see Chart 1). Self-employed workers first paid taxes in 1951 at a rate that was less than the combined employer and employee rate for other covered workers. For example, in 1951 the combined Old-Age and Survivors Insurance (OASI) tax rate for employers and employees was 3.0 percent, while the OASI tax rate for the self-employed was 2.25 percent (SSA 2008). The Social Security Amendments of 1983 increased the self-employment tax rate to match the combined employee-employer Social Security and Medicare tax rates effective January 1, 1984 (General Accounting Office 1983). A temporary income tax credit reduced the effective tax rate from 1984 through 1989 (SSA 1990), and starting in tax year 1990, self-employed persons applied a factor of 92.35 percent (100 percent minus 7.65 percent) to their IRS-reported net earnings to determine their Social Security and Medicare taxable net earnings (SSA 2009c, Chapter 12).12 This tax deduction provides similar Social Security and income tax treatment of employees, employers, and self-employed workers (SSA 1990). On their adjusted net earnings, self-employed workers pay a tax rate equivalent to the combined employer and employee OASDI and HI tax rate.13
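As a worked illustration of the 92.35 percent factor, the sketch below applies it to a hypothetical net earnings amount and then applies the 15.3 percent combined rate quoted earlier. The function names and the rounding to cents are assumptions made for the example, and the tax calculation assumes the adjusted amount falls below the OASDI taxable maximum.

```python
from decimal import Decimal, ROUND_HALF_UP

SE_FACTOR = Decimal("0.9235")     # 100 percent minus 7.65 percent
COMBINED_RATE = Decimal("0.153")  # combined OASDI and HI rate quoted in the text
CENT = Decimal("0.01")


def se_taxable_net_earnings(net_earnings):
    """Apply the 92.35 percent factor to IRS-reported net earnings."""
    adjusted = Decimal(str(net_earnings)) * SE_FACTOR
    return adjusted.quantize(CENT, rounding=ROUND_HALF_UP)


def se_tax_below_maximum(net_earnings):
    """Combined tax, assuming the adjusted amount is below the OASDI maximum."""
    taxable = se_taxable_net_earnings(net_earnings)
    return (taxable * COMBINED_RATE).quantize(CENT, rounding=ROUND_HALF_UP)


print(se_taxable_net_earnings(50_000))  # 46175.00
print(se_tax_below_maximum(50_000))     # 7064.78
```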

SSA obtains earnings information for the self-employed electronically from IRS Form 1040 Schedule SE (self-employment tax).14 Before 1991, the IRS sent self-employment earnings data to SSA only when those earnings were reported as Social Security taxable. For a worker with both employment and self-employment earnings, payroll taxes are paid on the employment earnings first. Until 1991, if an individual's wages from employment reached or exceeded the OASDI taxable maximum, SSA would not collect any self-employment information for the worker during that year. If wages were less than the OASDI taxable maximum, then the employee was required to pay OASDI taxes on any self-employment income up to the OASDI taxable maximum. Therefore, SSA collected partial data reflecting self-employment income up to the OASDI taxable maximum. Starting in 1991, additional self-employment earnings—up to the Medicare taxable maximum—were added to the MEF. With the elimination of the Medicare taxable maximum in 1994, the MEF began including all reported self-employment income.
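The three reporting regimes for self-employment income can be sketched as follows. This is an interpretation rather than SSA's procedure: it assumes that wages always count against the applicable maximum first, and that the same ordering applied to the Medicare maximum for 1991 through 1993. The function and parameter names are hypothetical.

```python
def se_income_in_mef(se_income, wages, applicable_max=None):
    """Self-employment income that would appear in the MEF for one year.

    applicable_max is the OASDI maximum (before 1991), the Medicare
    maximum (1991-1993), or None (1994 and later, no maximum).
    """
    if applicable_max is None:
        return se_income
    # Wages are taxed first; only the remaining room is available.
    room = max(0, applicable_max - wages)
    return min(se_income, room)


# Hypothetical pre-1991 year with an illustrative $50,000 maximum.
print(se_income_in_mef(20_000, 60_000, 50_000))   # wages already at the maximum
print(se_income_in_mef(20_000, 40_000, 50_000))   # partial amount recorded
# 1991 Medicare maximum quoted in the text.
print(se_income_in_mef(20_000, 40_000, 125_000))
# 1994 and later.
print(se_income_in_mef(20_000, 200_000))
```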

Deferred Compensation

Another change to the MEF resulted from the proliferation of deferred compensation. Deferred compensation is an arrangement in which a portion of an employee's wages is paid out in a year after the one in which it is earned. This typically occurs with certain retirement plans such as 401(k)s and is usually done to defer the payment of income taxes. In 1984, earnings reports began to include elective deferrals for those workers with wages below the taxable maximum, although deferrals were not explicitly identified and the information was incomplete (Pattison and Waldron 2008).15 As previously discussed, the Social Security taxable maximum is indexed to the growth rate in the national average wage. In 1989, P.L. 101-239 changed the calculation of the national average wage to include certain types of deferred-compensation plans for years after 1990.16 Since 1990, SSA has collected additional information on the aggregate value of workers' deferred compensation from Form W-2 to include in the national average wage calculation, which is used to calculate the annual taxable maximum (and other wage-indexed amounts for the OASDI program).17 Starting in 2004, SSA began to capture information on the specific type of deferred compensation (for example, a 401(k), 403(b), or 457(b) pension plan) and wages that were put into Health Savings Accounts (HSAs).18 This more detailed deferred compensation and HSA information is now contained in the MEF.

Relevant Time Periods

Amendments to the Social Security Act have not only increased the number and types of workers covered by the program, they have also necessitated changing the types of earnings information that are collected by SSA. Other laws passed by Congress and technological changes have also shaped the MEF data. The development of the MEF can be divided into three significant time periods: 1937–1950, 1951–1977, and 1978 to date (see Chart 3).

Chart 3.
Summary of earnings in the Social Security Master Earnings File
SOURCE: Michael Compson, Office of Research, Evaluation, and Statistics, Social Security Administration.
a. For 1978 to 1990, self-employment income is included only to the extent that it is taxable under the Social Security program. In general, during this period there is no way to break out the amount of covered earnings from wages and salary, self-employment income, or earnings from agriculture. Beginning in 1991, there is a difference between the maximum amount of earnings covered under Social Security and the Medicare program. In 1994, the cap on taxable Medicare-covered earnings was eliminated. As a result, 1994 is the first year in which earnings data provide a full accounting for wage and salary, tip, and self-employment income.
b. Beginning in 1991, the amount of earnings taxable under the Social Security program and under the Medicare program differed. From 1991 to 1993, there were caps on the total amount of earnings taxable under the Medicare program. In 1994, the cap on the amount of earnings taxable under the Medicare program was eliminated.

1937–1950. Initially, taxable wage reports for individual workers were sent by the IRS' forerunner, the Bureau of Internal Revenue, to SSB and, later, to SSA. Reports were sent on a semiannual basis in 1937, and on a quarterly basis thereafter, for employers with workers covered by the program. These wages were reported for each worker up to the taxable maximum (Fay and Wasserman 1938). This information, sent to SSB on the Employer Report Schedule A (Form 941),19 was then manually transferred to punch cards. The punch card data were entered onto the ledger accounts of individual wage earners and checked against employer totals for accuracy (Corson 1938).20 Recordkeeping of earnings during this period involved the use of collating, sorting, card punching, accounting, and posting machines (Cronin 1985). Noncovered earnings were not reported to SSA in the early years of Social Security because they were not needed for program purposes. Owing to limited storage capacity and the prohibitive costs of converting early earnings data to an electronic format, data for 1937 to 1950 are only available as two aggregate numbers for each worker:21 total Social Security taxable earnings for 1937–1950 and total QCs for 1947–1950.

When needed, there are various procedures for establishing a count of QCs for this 14-year period. First, SSA counts QCs from 1951 forward. If those are insufficient to establish insured status, the QCs from 1947 to 1950 are considered, as well. If these do not provide enough quarters to be insured, then SSA allocates one QC for each $400 of covered earnings from 1937 to 1950.22 If the individual is still not insured, SSA conducts a detailed earnings search of the microfilm records to determine the exact number of QCs. The individual is credited with a QC for any quarter in which he or she had $50 or more in covered earnings during this period; if he or she earned the taxable maximum in a year, four QCs (the most that can be earned each year) are credited.
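The stepwise procedure above can be sketched as a simple cascade. Two points are assumptions rather than statements from the source: that the $400 rule replaces, rather than adds to, the 1947–1950 count, and that the derived count is capped at 56 (four QCs for each of 14 years). The function name and the required number of QCs are hypothetical.

```python
def qc_step_for_insured_status(required, qcs_1951_on, qcs_1947_1950,
                               earnings_1937_1950):
    """Return (step, qcs_counted) for the first step that reaches `required`."""
    # Step 1: QCs from 1951 forward.
    if qcs_1951_on >= required:
        return "1951 forward", qcs_1951_on
    # Step 2: add the QCs recorded for 1947-1950.
    total = qcs_1951_on + qcs_1947_1950
    if total >= required:
        return "plus 1947-1950", total
    # Step 3 (assumption): one QC per $400 of 1937-1950 covered earnings,
    # replacing the 1947-1950 count and capped at 4 QCs x 14 years.
    derived = min(earnings_1937_1950 // 400, 56)
    total = qcs_1951_on + derived
    if total >= required:
        return "$400 rule", total
    # Step 4: the exact count requires a search of the microfilm records.
    return "detailed microfilm search", None


print(qc_step_for_insured_status(40, 30, 4, 6_000))
```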

1951–1977. The Social Security Amendments of 1950 changed the benefit calculation to increase monthly benefits payable (Cohen and Myers 1950). The new benefit calculation "put greater reliance on the use of individual yearly earnings totals" starting in 1951 (Cronin 1985). In addition, SSA began converting files to microfilm in the late 1940s and early 1950s, and the installation of the first computer in 1956 greatly increased the use of magnetic tape at the agency (Cronin 1985). The final earnings records during this period contained detailed quarterly and summary earnings information on microfilm, including the claim or disability status of the individual (SSA n.d.). Earnings information for individual workers up to the OASDI taxable maximum continued to be reported quarterly by employers through 1977.23 If an employee reached the taxable maximum during the year, the employer was not required to report any information on that employee in subsequent quarters. After 1951, if an employee's combined wages from two or more employers exceed the taxable maximum, the record includes wages exceeding the maximum.24 However, for 1951 to 1977, only the total annual earnings amount is contained in the MEF; in later years, the amounts for each employer are also available. Similarly, earnings from self-employment were added to any employee wages and recorded as a yearly total in the MEF during this period.

Total QCs and the quarterly pattern in which wages were earned are also available for each year of data; however, QCs were allocated by different methods, depending on the type of earnings, during this period. For covered wages and tips, a QC was credited for each quarter in which the employee had $50 or more in earnings, up to four QCs a year. An employee with maximum covered earnings was credited with four QCs for the year. A self-employed individual would receive a QC for each $100 of self-employed income, up to four QCs a year; and a QC was allocated for each $100 of agricultural earnings (SSA 2008).25
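The allocation rules for this period, as stated in the paragraph above, can be sketched in a few lines. The function names are hypothetical, and capping agricultural earnings at four QCs a year is an assumption based on the four-per-year limit mentioned earlier.

```python
def wage_qcs_1951_1977(quarterly_wages, taxable_max):
    """QCs from covered wages and tips: one per quarter with $50 or more."""
    # A worker with maximum covered earnings is credited with four QCs.
    if sum(quarterly_wages) >= taxable_max:
        return 4
    return min(4, sum(1 for w in quarterly_wages if w >= 50))


def per_100_dollar_qcs(annual_earnings):
    """QCs at one per $100 of annual earnings, up to four a year."""
    return min(4, annual_earnings // 100)


# Two quarters reach $50, so two QCs are credited.
print(wage_qcs_1951_1977([60, 40, 0, 500], taxable_max=3_600))
# Reaching the taxable maximum in the first quarter still yields four QCs.
print(wage_qcs_1951_1977([3_600, 0, 0, 0], taxable_max=3_600))
print(per_100_dollar_qcs(250))
```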

1978 to date. Under P.L. 95-216, beginning in 1978, SSA began collecting wage and salary information directly from employers on the Form W-2 Wage and Tax Statement on an annual basis. This reduced processing delays and the administrative burdens of reporting and collecting information quarterly. The switch to Form W-2 also meant that SSA had access to information, such as wages above the taxable maximum and wages from noncovered employment, it had not previously received. By the late 1970s, electronic capabilities had increased dramatically from the original punch cards and magnetic tape used by SSA, enabling the agency to store the additional W-2 information (see the Data Available section for the full list of variables contained in the MEF today). In 1978, most W-2 information was received in paper form and was keyed into the SSA earnings record system on magnetic tape at three data keying centers (Cronin 1985). As more employers began to submit their wage reports via electronic media, only one data processing center was needed. Today, employers can go directly to SSA's Business Services Online26 to submit W-2s electronically and to request verification of employees' names and SSNs through the Social Security Number Verification System. Although some earnings information still comes to SSA in paper form, 81 percent of W-2s in fiscal year 2007 were filed electronically by employers, thus reducing the administrative costs of entering and maintaining the earnings data (SSA 2007).

A QC was earned for each $250 of reported covered earnings from all sources (such as wages, tips, and self-employment) up to the annual limit of four in 1978. Since 1979, the amount of earnings needed for each QC has increased annually, proportional to the national average wage. In 2009, a QC is earned for each $1,090 of covered earnings.
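From 1978 onward the calculation depends only on annual covered earnings and the dollar amount required for one QC in that year. A minimal sketch, with a hypothetical function name and whole-dollar inputs assumed:

```python
def qcs_from_annual_earnings(covered_earnings, amount_per_qc):
    """QCs for a year since 1978: one per amount_per_qc, up to four."""
    return min(4, covered_earnings // amount_per_qc)


print(qcs_from_annual_earnings(600, 250))      # 1978 amount: $250 per QC
print(qcs_from_annual_earnings(3_269, 1_090))  # 2009 amount: $1,090 per QC
print(qcs_from_annual_earnings(50_000, 1_090))
```

Under this rule the timing of earnings within the year no longer matters; a worker earning $4,360 in a single month of 2009 would receive all four QCs.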

Posting Process

Before posting earnings data to an individual's record, SSA verifies that the name and SSN on the W-2 or self-employment income report match information in its Numerical Identification (Numident) file. Records in the Numident are established when an individual applies for an SSN by filling out an SS-5 form.27 SSA enters information from the SS-5 into the Numident file, which contains each person's name, SSN, sex, self-reported race, birth date, and place of birth. Numident records are updated when an individual reports a name change (usually from marriage), requests a correction, asks for a replacement for a lost card, or dies. Verification and date of death come from state vital records or from public reporting (claimants, family members, or funeral homes) and are stored in a separate record.

SSA receives information on employee wages from the employer on Form W-2 Wage and Tax Statement and Form W-3 Transmittal of Wage and Tax Statements, and on self-employment earnings from IRS data files derived from Schedule SE and the unreported wages and tips line item on Form 1040, U.S. Individual Income Tax Return.28 Form W-2 currently contains the following information:

SSA stores some of the W-2 information as administrative data; most of it is sent to the IRS.

The W-3 is a summary form that contains aggregate earnings information for all employees in the wage report. For SSA to accept the wage amounts on the W-2s, their cumulative total must agree with the W-3. If the data files from these forms balance against one another, the information can then be posted to individual earnings records. SSA receives information from employers and the IRS continuously; therefore the MEF is updated on a weekly basis.
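The balancing requirement amounts to a simple reconciliation check. The sketch below is illustrative only: it compares a single wage amount per W-2, held in integer cents to avoid rounding differences, and the function name is hypothetical. An actual wage report carries several dollar fields that would each need to agree.

```python
def wage_report_balances(w2_wages_cents, w3_total_cents):
    """True when the W-2 amounts in a wage report sum to the W-3 total."""
    return sum(w2_wages_cents) == w3_total_cents


w2_amounts = [2_500_000, 4_125_050]  # $25,000.00 and $41,250.50
print(wage_report_balances(w2_amounts, 6_625_050))  # balanced report
print(wage_report_balances(w2_amounts, 6_625_000))  # out of balance
```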

Each year, SSA processes about 245 million employee wage reports submitted by about 6.9 million employers (SSA 2009d). As noted earlier, in order for the earnings on these wage reports to be posted to the MEF, the combination of the name and SSN on Form W-2 must be matched to the Numident.29 If either is different, SSA applies over 20 separate computer routines and other techniques to attempt to find matches for the initial mismatches. Approximately 90 percent of the wage reports received by SSA each year are posted to the MEF without difficulty. After the computerized routines are applied, approximately 96 percent of wage items are successfully posted to the MEF (GAO 2005).

If the name and SSN do not match, even after SSA has performed its computer matching routines, the wages cannot be credited to the individual's account. Instead, the earnings are placed in the Earnings Suspense File (ESF). The ESF retains unposted items until they can be correctly assigned and placed in the MEF with a valid name and SSN.30 SSA performs additional operations annually to further attempt to match earnings to individuals' records. To ensure that workers have an opportunity to correct any discrepancies in their earnings records, SSA has since 1979 sent letters to all employees whose names and SSNs cannot be matched. In 1994, SSA began also to send letters to employers who submit more than 10 W-2s with nonmatching names and SSNs, when these represent more than 0.5 percent of the W-2s they submit.31 Additionally, beginning in 2000, all workers and former workers aged 25 or older receive an annual Social Security Statement that lists by year all Social Security and Medicare earnings that have been posted to the MEF to date.32 These statements have led to earnings being corrected at earlier ages, when workers can provide evidence of the wages missing from or erroneously posted to their record. Remaining discrepancies can be corrected at the time of benefit application, when individuals scrutinize their earnings records to ensure all their earnings are being used to calculate their monthly benefit amount.
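The posting decision and the employer-letter threshold can be sketched as follows. This is a toy model, not SSA's matching logic: the Numident is represented as a dictionary from SSN to name, the repair routine stands in for the more than 20 routines SSA applies (whose details are not described here), and all names and SSNs are fabricated placeholders.

```python
def route_wage_item(name, ssn, numident, repair_routines=()):
    """Return 'MEF' if the name and SSN match, directly or after repair."""
    candidates = [(name, ssn)]
    candidates += [routine(name, ssn) for routine in repair_routines]
    for cand_name, cand_ssn in candidates:
        if numident.get(cand_ssn) == cand_name:
            return "MEF"
    return "ESF"  # unmatched items go to the Earnings Suspense File


def normalize_name(name, ssn):
    """Hypothetical repair routine: trim whitespace and uppercase the name."""
    return name.strip().upper(), ssn


def employer_gets_letter(mismatched_w2s, total_w2s):
    """More than 10 nonmatching W-2s that exceed 0.5 percent of the total."""
    # Integer form of mismatched / total > 0.005.
    return mismatched_w2s > 10 and mismatched_w2s * 1000 > 5 * total_w2s


numident = {"000-00-0001": "JANE DOE"}
print(route_wage_item(" jane doe ", "000-00-0001", numident))
print(route_wage_item(" jane doe ", "000-00-0001", numident, [normalize_name]))
print(employer_gets_letter(11, 2_000))
```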

As of October 2007, 275 million wage items for tax years 1937 to 2005 were in the ESF, amounting to $661 billion in wages for which Social Security taxes have been paid (SSA 2009b). Because SSA maintains these data for a long time, individuals with legitimate earnings missing from their earnings records can have them properly posted (there may also be legitimate earnings missing from earnings records that are not in the ESF).33 Researchers using the MEF should understand that earnings records could be incomplete or contain extraneous earnings for certain individuals, and that there is no indicator to warn that an individual's earnings record is erroneous.

Data Available

Once SSA confirms that the employer- and IRS-reported name and SSN of a worker match those recorded in the Numident, his or her earnings are posted to the MEF Earnings Detail Segment, and the MEF Summary Segment is updated (Panis and others 2000).34 The Summary Segment contains annual OASDI-taxable wages and tips and self-employment earnings from 1951 to the present; cumulative taxable earnings for 1937–1950, 1951–1977, and 1978 to date; annual information on MQGE from 1983 to date; cumulative QCs for 1947–1950 and 1951–1977; annual QCs for 1951–1977 (QCs from 1978 to date are computed using reported earnings); and railroad and military earnings indicators.35 The Summary Segment also includes variables such as SSN, race, sex, date of birth, date of death,36 an indicator of earnings prior to 1950, first year of earnings after 1950, and last year of earnings (Panis and others 2000). This segment summarizes all the OASDI- and HI-taxable earnings since 1978 as reported in the Detail Segment and also contains all reported taxable earnings by tax year. Taxable earnings from more than one employer are summarized into one yearly total. For example, if an individual earned $20,000 from each of two different employers, the total earnings would be listed as $40,000 (thus individuals with more than one employer may have earnings that exceed the taxable maximum). The Summary Segment contains no information on employers.
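The way the Summary Segment collapses employer-level postings into one yearly total can be illustrated with a small aggregation. The record layout (year, employer identifier, taxable earnings) and the function name are hypothetical and do not reflect the actual MEF file format.

```python
from collections import defaultdict


def summarize_by_year(detail_records):
    """Sum taxable earnings across employers into one total per year."""
    totals = defaultdict(int)
    for year, _employer, taxable_earnings in detail_records:
        totals[year] += taxable_earnings
    return dict(totals)


records = [
    (2005, "EIN-A", 20_000),
    (2005, "EIN-B", 20_000),
    (2006, "EIN-A", 21_000),
]
print(summarize_by_year(records))  # {2005: 40000, 2006: 21000}
```

Because the employer identifier is discarded in the aggregation, a researcher working only from the yearly totals cannot tell whether an amount above the taxable maximum came from one job or several.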

The Earnings Detail Segment includes annual W-2–level data from each of a worker's employers since 1978, as well as self-employment earnings information from the Schedule SE. The W-2 information includes the employer identification number (EIN),37 OASDI and Medicare taxable wages, and total wages reportable as IRS-taxable income on Form 1040 (currently shown in Box 1 of Form W-2, this amount includes wages above the OASDI taxable maximum, noncovered wages, and deferred-compensation distributions, but does not include deferred-compensation contributions). Information from delinquent or corrected W-2s (W-2c's) is included in separate records in the Detail Segment.38 Information from self-employment postings includes the OASDI and Medicare taxable earnings, but does not indicate deferred-compensation contributions. Detail Segment records also contain additional variables pertaining to types of posting. These include an Earnings Report Type (ERT) code, indicating earnings categories such as covered, noncovered, delinquent, self-employment, and unreported tips; and an Earnings Type of Employment (EET) code, which indicates employment categories such as regular, military, self-employed, agricultural, nonprofit, state and local government, household, railroad, MQGE, and workers with tip income (Panis and others 2000). As of December 2005, about three-fourths of earnings in the MEF Detail Segment were taxable wages from Form W-2, with the rest consisting of noncovered W-2 wages, self-employment income, and delinquent W-2 earnings. From 1978 through 2005, about three-quarters of wages came from regular employment, while most of the rest came from tips and from employment in the military, state and local government, agriculture, households, and railroads (Pattison 2007).39

When the Detail Segment process was established in 1978, only two amounts were taken from the W-2: OASDI-taxable earnings (to be added into each person's summary earnings record) and the IRS-taxable wage (to be reported on Form 1040 and used in calculating the national average wage). There are still only two dollar fields on each Detail Segment record; so, in order to handle the information available on more recent W-2s, multiple records may be generated from a single W-2. The initial detail posting, called the primary wage posting, will contain two dollar values: wages subject to federal income taxes (including amounts paid under deferred-compensation plans) and OASDI-taxable earnings. Additional MEF records are created for a W-2 if it includes earnings above the Social Security–taxable maximum in 1991 and later (the excess earnings would be HI-taxable), deferred compensation in 1990 and later, or tips. Additional records are also created for corrected W-2s (W-2c's).

For example, in 2003, the OASDI-taxable maximum was $87,000 and for a worker earning $100,000, two records would be generated. The primary wage posting would show IRS-taxable earnings of $100,000 in the IRS taxable field and OASDI-taxable earnings of $87,000 in the OASDI/HI field. A secondary wage posting for HI-taxable earnings would have $0 in the IRS field and $13,000 in the OASDI/HI field representing HI-taxable earnings above the OASDI-taxable maximum. The OASDI/HI field can be used for other purposes as well, such as OASDI- and/or HI-taxable tips. The ERT and EET indicators show the type of earnings and employment represented in each of the fields in each posting. Depending on the information in an individual's W-2, there may be a single MEF detail record or there may be many records to account for multiple employers, earnings over the taxable maximum, or other types of earnings including tips, HSAs, or deferred compensation.
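
The record-splitting rule in this example can be sketched in a few lines of Python. This is an illustration only: the `Posting` class, its field names, and the `kind` labels are hypothetical and do not reproduce the MEF's actual record layout or its ERT and EET codes. The sketch also ignores tips, deferred compensation, HSAs, and corrected W-2s.

```python
from dataclasses import dataclass


@dataclass
class Posting:
    kind: str            # hypothetical label, not a real ERT/EET code
    irs_field: int       # dollars in the IRS-taxable field
    oasdi_hi_field: int  # dollars in the OASDI/HI field


def wage_postings(irs_wages, covered_wages, oasdi_max):
    """Split one W-2 into a primary posting plus, if needed, an HI-only posting."""
    oasdi_taxable = min(covered_wages, oasdi_max)
    postings = [Posting("primary", irs_wages, oasdi_taxable)]
    excess = covered_wages - oasdi_taxable
    if excess > 0:
        # Earnings above the OASDI maximum go in a second record
        # with $0 in the IRS field.
        postings.append(Posting("hi_excess", 0, excess))
    return postings


# The 2003 example from the text: $100,000 in wages, $87,000 taxable maximum.
example = wage_postings(100_000, 100_000, 87_000)
for p in example:
    print(p)
```

For a worker earning less than the taxable maximum, the same function returns only the primary posting.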

Limitations and Complexities

As shown above, the SSA Master Earnings File contains extensive historical data on U.S. earnings. However, as with all data sets—especially administrative data sets—there are some limitations and complexities that researchers must acknowledge (although it is important to note that these limitations do not preclude SSA from properly administering the program or determining benefit eligibility or benefit amounts). Foremost, earnings data were first collected for the sole purpose of computing Social Security benefits. In the earlier years, only data on earnings up to the OASDI-taxable maximum were collected because any earnings over this amount did not factor into the benefit formula. This is one limitation of the data prior to 1978. In addition, data on race in the MEF are limited to a single undated entry, which does not account for changes in race coding over time (Scott 1999). Another limitation arises from the existence of the ESF, which includes wage reports that could not be entered into the MEF. This means that not all earnings from 1937 to the present are included in the file. Lastly, there could be errors resulting from the employer failing to report earnings properly or in a timely manner, from clerical errors, or from data being keyed improperly.

Some employer errors can be corrected by submitting a W-2c. However, introducing corrected earnings into the MEF may create additional problems because the previous earnings posting does not get removed when a W-2c is received. Instead, two new postings are created: one includes a negative amount to offset the original wage report, and the other includes the new, correct amount. For example, if a worker's original W-2 stated earnings of $20,000 and the W-2c stated corrected earnings of $15,000, SSA would create two new postings, one reporting −$20,000 and the other reporting the new earnings amount of $15,000. Occasionally, a negative dollar amount can result if more than one correction is made to a worker's earnings. This can happen when both the worker and the employer try to correct a mistake, resulting in a double correction, or a correction is resubmitted while the original submission is still working its way through the system. (These instances were more common in the past, as modernization and enhancements to SSA's computer systems have largely put an end to double corrections.) In addition, some employers may erroneously file a new W-2 instead of a W-2c to correct a mistake. Internal SSA processes check for duplicate postings of the same amount; when detected, the original amount is then offset. However, if the amounts on the W-2s differ, the new amount will be entered without offsetting the old amount, resulting in a false earnings total. The large majority of employers who file W-2c's, however, do so correctly.
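
The arithmetic behind these correction scenarios can be illustrated with a short Python sketch. It assumes, for illustration only, that a worker's posted total is simply the sum of signed posting amounts; the list layout and the function name `posted_total` are hypothetical and do not describe SSA's actual processing.

```python
def posted_total(postings):
    """Net earnings implied by a list of signed posting amounts."""
    return sum(postings)


original = 20_000
corrected = 15_000

# Proper W-2c: an offsetting negative posting plus the corrected amount.
with_w2c = [original, -original, corrected]

# New W-2 filed by mistake with a different amount: nothing offsets the original.
with_second_w2 = [original, corrected]

# The same correction processed twice: the original is offset two times.
double_correction = [original, -original, corrected, -original, corrected]

print(posted_total(with_w2c))           # 15000
print(posted_total(with_second_w2))     # 35000
print(posted_total(double_correction))  # 10000
```

Under this simplified model, the double-correction total would turn negative whenever the corrected amount is less than half of the original amount.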

Another issue arose beginning in 1978, when earnings information started to come to SSA annually on Form W-2. Even after the switch, some state and local governments were still able to report their employees' earnings under the old quarterly system. Some reported under both the old and new systems. This resulted in some double postings for a few years because different EINs were used under each system, with the quarterly system using a special EIN beginning with the digits 69 (to identify state and local government employers and earnings) and the annual system requiring a regular EIN (IRS 2009). Some state and local governments also used different EINs for reporting to SSA and to the IRS. When different EINs were used for each agency, some earnings were posted twice. This continued until tax year 1981, when SSA no longer allowed state and local governments to report earnings on a quarterly basis (Cronin 1985). Use of EINs with the 69 prefix ended in 1986 (IRS 2009). SSA corrects duplicate earnings records when notified by affected employees.

There are also some issues in the MEF data related to self-employment earnings. Total self-employment earnings reported by individuals and the total number of self-employed workers prior to 1978 cannot be determined because of the way these data were collected by SSA (described above). In addition, self-employment earnings that were taxable by Medicare only were not recorded from 1991 through 1993. This was not discovered until 1994 and at that time only data from 1992 and 1993 could be recovered retroactively; for 1991, only self-employment earnings from delinquent reports are available. Therefore, complete self-employment income data for 1991 are not available. Finally, there may be limitations in the data reported to SSA, as they depend on the accuracy of data reported by self-employed individuals on IRS tax forms.

Uses of the Master Earnings File Data

The MEF data are used extensively, but are mainly used for calculating Social Security benefits for individuals and any auxiliary beneficiaries they may have.40 First, the earnings data are used to determine if a person has sufficient QCs to qualify for benefits. SSA also uses earnings records from the MEF to calculate benefit amounts.41 For benefit calculations, an individual's total taxable OASDI earnings for each year (including earnings from different employers and self-employment, military credits, and railroad earnings) are added together to determine total annual earnings up to the taxable maximum.42 The annual earnings amounts are then indexed using the national average wage index (AWI) series, to ensure that benefits reflect the general rise in U.S. wages over the person's working lifetime.43 The sum of indexed earnings in the years of highest earnings is then divided by the number of months in the computation period (35 years for retirement benefits, 35 or fewer for disability and survivors benefits). The result is called average indexed monthly earnings (AIME). AIME is then used in a formula to calculate monthly benefit amounts for OASDI beneficiaries.44
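
The steps described above (cap each year's earnings at the taxable maximum, index by the AWI, sum the highest years, divide by the months in the computation period) can be sketched as follows. This is a simplified illustration, not SSA's benefit computation: the function name, the dictionary inputs, and the toy numbers are hypothetical, every year is indexed to a single `index_year`, and details of the actual computation that the text does not describe are omitted.

```python
def aime_sketch(earnings, taxable_max, awi, index_year, n_years=35):
    """Average indexed monthly earnings under simplified assumptions.

    earnings, taxable_max, awi: dicts keyed by year (hypothetical inputs).
    """
    indexed = []
    for year, amount in earnings.items():
        # Count earnings only up to that year's taxable maximum.
        capped = min(amount, taxable_max[year])
        # Scale by average-wage growth between that year and the index year.
        indexed.append(capped * awi[index_year] / awi[year])
    # Keep the years of highest indexed earnings.
    top = sorted(indexed, reverse=True)[:n_years]
    # Divide by the number of months in the computation period.
    return sum(top) / (n_years * 12)


# Toy data (not real earnings, maximums, or AWI values), using a 2-year period.
earnings = {2000: 10_000, 2001: 20_000, 2002: 50_000}
taxable_max = {2000: 40_000, 2001: 40_000, 2002: 40_000}
awi = {2000: 100.0, 2001: 100.0, 2002: 200.0}
toy = aime_sketch(earnings, taxable_max, awi, index_year=2002, n_years=2)
print(round(toy, 2))
```

If a worker has fewer than `n_years` years of earnings, the missing years contribute zero to the sum while the divisor stays the same, so the average falls accordingly.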

For individuals already receiving benefits, MEF records are used for several programmatic purposes. If a retiree has not reached his or her full retirement age and earns more than a specified amount in a year (in 2009, the amount is $14,160), benefits are reduced $1 for every $2 over the earnings limit (the reductions are offset with an increase in benefits when the retiree reaches full retirement age).45 In addition, each year SSA completes an Automatic Earnings Reappraisal Operation (AERO) or a manual recomputation to determine if any new earnings have been posted to a beneficiary's record. If so, the SSA computer system recalculates the monthly benefit (as described above). New earnings exceeding those in one of the previous 35 highest years of earnings would change the beneficiary's AIME, resulting in higher benefits. The maintenance of earnings information before and even after an individual begins receiving benefits is vital for the operation of the program.
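
The earnings test arithmetic can be sketched in Python as below. The function name and parameters are hypothetical, whole-dollar integer division is an assumption about rounding, and the result is not capped at the retiree's annual benefit; the sketch is meant only to show the $1-for-$2 rule with the 2009 threshold quoted in the text.

```python
def earnings_test_reduction(annual_earnings, exempt_amount=14_160,
                            dollars_per_reduction=2):
    """Dollars of benefits withheld under the retirement earnings test (simplified)."""
    excess = max(0, annual_earnings - exempt_amount)
    # $1 withheld for every `dollars_per_reduction` dollars over the limit.
    return excess // dollars_per_reduction


print(earnings_test_reduction(20_160))  # 3000
print(earnings_test_reduction(10_000))  # 0
# Year in which full retirement age is reached (2009 figures from endnote 45).
print(earnings_test_reduction(40_680, exempt_amount=37_680,
                              dollars_per_reduction=3))  # 1000
```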

In addition to program-specific uses, MEF data are used to create other files of interest to researchers. One significant example is the Continuous Work History Sample (CWHS). The CWHS is a 1-percent sample of all SSNs issued from 1937 to the current year.46 It contains earnings and employment information derived from the MEF, demographic information derived from the Numident, and annual benefit data derived from the Master Beneficiary Record.47 The MEF data for the CWHS are extracted annually in January, approximately 13 months after the end of the tax year, and are generally available in the spring of that year, after the file is validated (for example, the 2007 CWHS was pulled in January 2009 and will be available in mid-2009). The CWHS is broken into an active file (3.3 million records were in the 2006 file), which includes SSNs with any earnings in the MEF; and an inactive file (1.1 million records were in the 2006 file), which includes SSNs that have never had earnings posted to the MEF.48 The CWHS is used by SSA researchers as well as by those at the Treasury Department and the Congressional Budget Office through Memoranda of Agreement (MOA). IRS law precludes its release to others (Buckler 1988).

In addition to the 1-percent sample described above, the CWHS system produces two annual Employee-Employer (EE-ER) files, a Longitudinal Employee-Employer Data (LEED) file, and an annual Self-Employment (SE) file, all of which are 1-percent samples based on data contained in the MEF. Of the two EE-ER files, one contains covered wages only and the other contains both covered and noncovered wages (the latter includes nontaxable wages and HI-only taxable wages as well as wages from covered employment). The EE-ER files also contain age, sex, race, and deferred-compensation contribution variables. The importance of these files is that they show employee and employer location (county and state) and the employer's type of industry, since wages are reported at the employer level in the Detail Segment of the MEF (Panis and others 2000).49 For each tax year, one version of the EE-ER is created when the data for the active CWHS are extracted and a second is created 2 years later, to incorporate any delinquent earnings data and to be added to the LEED file. The LEED file is a 1-percent longitudinal sample of the EE-ER records with data for 1957–2004.50 The industry data contained in the CWHS and EE-ER files come from IRS Form SS-4 (the application for an EIN), income tax returns, and the Census Bureau.51 These data are categorized according to the North American Industry Classification System (NAICS), which assigns industry codes for the United States, Canada, and Mexico (Census Bureau 2009). The LEED file is used for studies of workers in different geographic regions and different industries over time (Panis and others 2000). The SE file is an annual snapshot of initial self-employment postings to the MEF in the most recent earnings-processing year and contains data sent by the IRS to SSA, which are not stored on the MEF but are useful for statistical and research purposes (such as geographic data, farm/nonfarm earnings splits, and use of optional reporting methods).

MEF data are also used for certain statistical publications and data files, such as Earnings and Employment Data for Workers Covered by Social Security and Medicare, by State and County; Benefits and Earnings Public-Use File, 2004; and certain tables in the Annual Statistical Supplement to the Social Security Bulletin.52 A new public-use earnings data file based on a 1-percent random sample of workers is currently being developed in SSA's Office of Retirement and Disability Policy (ORDP) for dissemination on the Social Security web site. This file could be very useful for outside researchers who are interested in long-term U.S. earnings data. In addition, SSA has published many studies using MEF data.53 As noted earlier, because the MEF contains tax return information, access is granted only according to terms of the Internal Revenue Code. The primary organizations that have been granted access to the MEF data for research purposes are the Census Bureau, the Department of the Treasury, and the Congressional Budget Office. The University of Michigan obtained the consent of respondents to use MEF data for its Health and Retirement Study (HRS). Outside researchers have coauthored papers with SSA employees who have access to the data, or used Census Bureau data linked with MEF data after being granted access by both the IRS and the Census Bureau.54,55


Conclusion

In 1938, John J. Corson, Director of the Bureau of Old-Age Insurance, noted "[a]s a byproduct of its necessary operations, the records of the Bureau of Old-Age Insurance will in [the] future provide a wealth of new sources of information regarding the working population of the United States." The Master Earnings File was created for the purpose of calculating benefits, but as Corson predicted, it has been used more broadly for improving the administration of the Social Security program. The earnings data available at SSA are used by researchers, analysts, and others to understand the past and present U.S. working populations. As with any large administrative data set, the MEF has some limitations of which researchers should be aware. Nevertheless, it is the premier source of earnings data on U.S. workers.


Notes

1 IRS tax data are governed by section 6103 of the Internal Revenue Code. SSA can use them only to record wages and cannot share them with any other federal agency. For more information, see,,id=158487,00.html. For general information on the Master Earnings File, see the Privacy Act Notice at

2 Some earnings data derived from the MEF are available only to a restricted set of researchers, while other earnings data are more widely available. For more information on access to the MEF, see the Uses of the Master Earnings File Data section of this article.

3 For the full text of the original Social Security Act of 1935, see

4 For information on how an employer withholds these taxes from an employee's pay today, see IRS Publication 15, available at

5 Other groups not covered by the original Social Security Act include agricultural workers, domestic servants, casual laborers, maritime workers, employees of federal and state governments or their instrumentalities, those workers employed after reaching age 65, and employees of religious, educational, charitable, and nonprofit organizations (SSB 1938). For more information on the history of coverage, see Myers (1993).

6 The major groups that are not covered include civilian federal employees hired before January 1, 1984; railroad workers; certain employees of state and local governments who are covered under their employers' retirement system; domestic workers and farm workers whose earnings do not meet certain minimum requirements; and persons with very low net earnings from self-employment.

7 The taxable maximum was set by statute for the years 1937–1974 and 1979–1981 (SSA 2009a). Amounts for all other years were determined under the automatic adjustment provision of the Social Security Act, established in the 1972 Social Security Amendments (for more information on these amendments, see For the full list of taxable maximum changes, as well as Social Security and Medicare tax rates and the rates paid by the self-employed up to the maximum amounts, see SSA (2008), Table 2.A3, available at

8 For more information on QCs, see

9 Part C (Medicare Advantage) and Part D (Prescription Drug Plan) have since been added to the Medicare program. For more information on Medicare, see

10 There are instances in the MEF when nongovernmental workers appear to have MQGE wages because their reported Medicare taxable wages are greater than their Social Security wages and the latter is less than the OASDI taxable maximum. These appear to be reporting errors, but are stored on the MEF as if they are Medicare wages in excess of Social Security wages.

11 For more information on these federal government employees, see and for more information on state and local government employees, see

12 Because Social Security benefits are based on an individual's earnings record, self-employed workers receive less credit for earnings in 1990 and later because of the factor applied to adjust their IRS net earnings by the amounts of OASDI and Medicare payroll taxes due.

13 For an explanation of how the self-employed pay Social Security and Medicare taxes, see These taxes are collected under the Self-Employment Contributions Act (SECA).

14 For more information on the Form 1040 Schedule SE, see

15 The MEF OASDI taxable earnings field only includes deferred compensation for Social Security–covered workers whose earnings are below the taxable maximum from 1984 forward. However, MEF began to record deferred compensation for all groups in a separate field in 1990. For more information on deferred compensation, see Pattison and Waldron (2008).

16 For more information on average wages for indexing during this time period, see Clingman and Kunkel (1992).

17 For more information on the national AWI and its use at SSA, see

18 403(b) plans cover most tax-exempt organizations and 457(b) plans cover public sector employees and nongovernmental tax-exempt entities, including hospitals and unions. 408(k) plans (for organizations with fewer than 25 employees) and 501(c) 18(d) plans (employee-funded pension trusts) are also distinguished in the MEF data. For more information on HSAs, see

19 Form 941 was the employer's quarterly federal tax return. Form 942 (employer's quarterly tax return for household employees) and Form 943 (employer's annual tax return for agricultural employees) were also submitted to SSB when applicable.

20 To view an original Social Security wage record, see

21 When detailed information is obtained for an individual prior to 1978, this is posted to the Detail Segment of the MEF. Therefore, information on the Detail Segment for this time period is incomplete and is only posted in special situations (usually if needed for benefit applications). For more information on the Detail Segment of the MEF, see the Data Available section of this article.

22 The MEF includes data on QCs for 1937 to present and for 1951 to present. Calculating the difference enables the determination of the cumulative QCs for 1937–1950.

23 An optical scanner was installed at SSA in 1966 to read and automatically transfer to magnetic tape a significant percentage of the typewritten paper wage reports sent in by employers (Cronin 1985).

24 For example, the taxable maximum was $9,000 in 1972, so if a worker earned $5,500 from one employer and $5,000 from another employer, he would have total earnings in the MEF above the taxable maximum for that year.

25 There are optional reporting procedures for the self-employed that allow them to claim $1,600 in earnings for Social Security purposes even in years when they had net earnings of less than $400. This allows them to remain insured for disability and retirement purposes (the QCs are allocated to specific quarters to best advantage the claimant). Effective for tax year 2008, the maximum amount reportable using the optional method of reporting will be equal to the amount needed for four QCs in a given year. For more information, see

26 See

27 Form SS-5 is available at Originally, SSNs were used strictly to establish and maintain a worker's earnings record. However, as the use of the SSN expanded for tax and banking purposes, people began acquiring SSNs at earlier ages. In 1987, SSA began the enumeration-at-birth (EAB) program in which a parent or legal guardian can request an SSN for a child as part of the birth registration process (Streckewald 2005). A small percentage of SSNs are still requested by working-age and older persons, mostly immigrants.

28 For information on the Forms W-2 and W-3, see For information on Form 1040, see

29 There are two exceptions to posting earnings to the MEF when the name and SSN match the Numident: when the Numident record contains a death indicator, and when the Numident date of birth indicates that the individual is under age 7.

30 In addition to the exceptions mentioned in the preceding endnote, the ESF also contains postings for individuals who claim that earnings on their record are not their own.

31 For more information on these No-Match Letters, see SSA (2009d).

32 For information on the Social Security Statement, see

33 For example, in the past, some workers applied for and received a new SSN when they lost or forgot their original SSN, thereby separating their earnings record under the old number from that of the new one. In the 1980s, SSA established a procedure to determine if multiple numbers have been issued to a single person. Currently, SSA has software that will search the Numident file to prevent issuing a second number to an individual.

34 The Detail Segment includes a Posted Section that contains earnings that are subject to Social Security or Medicare taxes and an Unposted Section that contains related earnings information (such as railroad wages, noncovered earnings, deferred compensation, and HSA contributions). The Unposted Section has no amounts in the Social Security or Medicare taxable fields. For more information on the Posted and Unposted Detail Segments, see Panis and others (2000).

35 Individuals who have military service earnings from active duty from 1957 through 2001 can receive special extra earnings credits that are added to their Social Security records. These credits may qualify individuals for higher Social Security benefits. For more information on military credits, see

36 The date of death on the MEF is considered unreliable after 1978. The Master Beneficiary Record and Numident are the preferred sources for this variable (Panis and others 2000).

37 For more information on EINs, see

38 Delinquent W-2s are those posted after the January that is 13 months after the end of an earnings tax year. For more information on W-2c's, see the Limitations and Complexities section of this article and

39 Researchers and staff in SSA's Office of Retirement and Disability Policy (ORDP) do not use the full MEF. Instead they receive a query that contains two summary earnings research files: adjusted earnings (up to the OASDI taxable maximum) and nonadjusted earnings. These files do not contain all of the variables described in the Data Available section. The Office of Research, Evaluation, and Statistics (ORES), a division of ORDP, has a procedure to obtain a subset of data from the Summary Segment through a finder system that will retrieve data for specific SSNs. A similar procedure is used to retrieve data from the Detail Segment. ORES stores the data in a format that summarizes the data for a given SSN by year and EIN.

40 Safeguards are established in accordance with the SSA Systems Security Handbook to protect individuals' records. Employees with access to records have been notified of criminal sanctions for unauthorized disclosure of information about individuals. Magnetic tapes or other files with personal identifiers are retained in secured storage areas accessible only to authorized personnel. Microdata files prepared for research and analysis are purged of personal identifiers and are subject to procedural safeguards to assure anonymity.

41 If an individual has earnings records under two SSNs, they are combined for the purpose of calculating benefits.

42 If an individual had some railroad earnings, but not enough to qualify for Railroad Retirement benefits, these earnings would apply toward his or her Social Security benefit. For more information on railroad benefits, see Whitman (2008).

43 Information from the MEF is used to calculate the AWI series for 1951 to present, as mentioned earlier. For more information on AWI's origins and initial construction, see Donkar (1981). To see how the MEF data were used in calculating the AWI series, see for the 1973–1984 period and for the 1985–2007 period.

44 For more information on how benefits are calculated using AIME, see

45 A separate earnings test applies for the year in which a person reaches full retirement age. For example, for an individual reaching full retirement age in 2009, benefits are reduced $1 for every $3 of earnings above $37,680. The earnings test applies only until the month that full retirement age is attained. For more information on the retirement earnings test, see

46 For the list of variables contained in the CWHS, see Panis and others (2000). For more information on the CWHS, see Smith (1989) and the Privacy Act Notice at

47 For more information on the Master Beneficiary Record, see the Privacy Act Notice at

48 The CWHS is currently being modernized, which may change the output file structure.

49 If earnings information comes to SSA electronically, the employee's address is used, but the employer's address is used for earnings information submitted on paper. For the self-employed, the address listed on Form 1040 is used. Prior to 1980, the employer's address was always used. The MEF does not record geographic codes.

50 There is a 2-year lag between the data in the EE-ER file and the data in the LEED file.

51 Because the industry data are Census Bureau data, SSA researchers who access the data must be Special Sworn Status employees and have their projects approved by the Census Bureau. For more information, see

52 See,, and, respectively.

53 For a full listing of these and other SSA studies, see

54 As previously noted, the use of earnings data is governed by section 6103 of the Internal Revenue Code. For Census Bureau approval, projects must meet a purpose in Title 13 Chapter 5 of the U.S. Code. For more information, see

55 Two examples of this type of work include the papers "Uncovering the American Dream: Inequality and Mobility in Social Security Earnings Data Since 1937" by Wojciech Kopczuk, Emmanuel Saez, and Jae Song ( and "The Mis-Measurement of Permanent Earnings: New Evidence from Social Security Earnings Data" by Bhashkar Mazumder (


