Health Data Guidelines  

Guidelines for Selection of Population Denominators

Purpose

Background and Context

Sources of Estimates
Potential Problems
Revised population estimate series
Statistical imprecision of estimates
Breaks in estimate series
Changes in geographic boundaries

Definitions

Population estimates (by OFM)
Intercensal Population Estimates
Postcensal Population Estimates
Population Projection or Forecast

Guidelines

Source of Estimates
Continuity
Small area estimates
Citation

References

Appendix A: Linear Interpolation

Guidelines For Selection of Population Denominator (MS Word, 73 KB)

Purpose

The Assessment Operations Group in the Washington State Department of Health coordinates the development of guidelines related to data development and use in order to promote good professional practice among staff involved in assessment activities within the Washington State Department of Health and in Local Health Jurisdictions in Washington. While the guidelines are intended for an audience of differing levels of training related to data development and use, they assume a basic knowledge of epidemiology and biostatistics. They are not intended to recreate basic texts and other sources of information related to the topics covered by the guidelines. Rather, they focus on issues commonly encountered in public health practice and, where applicable, on issues unique to Washington State.

Background and Context

Since public health assessment relies extensively on the use of rates to describe the health status of a particular community, the denominators used in the calculation of rates are of considerable importance. The most commonly used denominator is the number of residents of an area. For most populations, actual counts are only available from the decennial census. As a result, population estimates are typically used as the basis for rates in the years between censuses. Since health status tends to vary by such socio-demographic characteristics as age, race/ethnicity, and sex, population estimates by these characteristics are extremely useful, and these estimates are often sought for many levels of geography including the state, counties, census tracts, and zip code areas.
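As a minimal sketch of why the denominator matters, the following Python snippet computes a crude rate per 100,000 residents. The function name and all counts are hypothetical and chosen only for illustration; they are not drawn from any OFM or DOH data.

```python
def crude_rate(events, population, per=100_000):
    """Return the number of events per `per` residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * per


# Hypothetical counts, for illustration only.
deaths = 450
rate_a = crude_rate(deaths, 60_000)  # denominator from one estimate series
rate_b = crude_rate(deaths, 57_000)  # same events, smaller population estimate

print(f"{rate_a:.1f} vs {rate_b:.1f} deaths per 100,000 residents")
```

With the same numerator, the smaller population estimate yields the higher rate, which is why the choice of estimate series affects comparability.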

Sources of Estimates

The U.S. Census Bureau enumerates the population every ten years. In other years, federal, state, and local governments and various private companies prepare intercensal and postcensal population estimates and projections for selected geographic units. Estimates made by these sources often differ due to differences in their estimation and projection methods, underlying assumptions, type of data used, and familiarity with idiosyncrasies of a population's demographic history. Since rates are affected by the population estimates that are used as denominators, the use of a common set of estimates among those routinely preparing state and local health statistics is strongly recommended for the sake of comparability and consistency. 

The Forecasting Division of the Washington State Office of Financial Management (OFM) prepares a number of population estimates and projections for state, county, and city populations which are available at the OFM website. Since the estimates prepared by OFM are used throughout state and local government for planning, analysis, and revenue allocation, their population estimates are considered a standard which should be adopted whenever possible. 

Since OFM does not prepare population estimates at levels required for some types of health analyses, other sources for these estimates must sometimes be used. These should be adjusted to correspond to OFM county and state totals. 

For example, Research and Data Analysis at the Department of Social and Health Services (DSHS) prepared population estimates by age, race/ethnicity, and sex for state, county, and sub-county areas (e.g., census tracts, zip codes, school districts, and legislative districts). These estimates were called the "DSHS Adjusted Population Estimates" because they were developed by adjusting sub-county estimates purchased from a private vendor to correspond to OFM's state and county population estimates by age and sex and by race and Hispanic origin. The DSHS Adjusted Population Estimates are outdated and no longer recommended for use since they have not been adjusted for the 2000 census anchor point.
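One simple way to make estimates from another source correspond to a control total is proportional scaling, sketched below in Python. This is an assumption made for illustration; the source does not describe the method DSHS actually used. The function name, tract labels, and all numbers are hypothetical.

```python
def adjust_to_control_total(estimates, control_total):
    """Scale each area's estimate so that the areas sum to control_total."""
    raw_total = sum(estimates.values())
    if raw_total <= 0:
        raise ValueError("estimates must sum to a positive number")
    factor = control_total / raw_total
    return {area: value * factor for area, value in estimates.items()}


# Hypothetical sub-county estimates from a non-OFM source.
tract_estimates = {"tract_A": 4_000, "tract_B": 6_000, "tract_C": 10_000}
# Hypothetical county total standing in for an OFM county estimate.
county_total = 21_000

adjusted = adjust_to_control_total(tract_estimates, county_total)
for area, value in adjusted.items():
    print(f"{area}: {value:,.0f}")
```

Scaling by a single factor preserves each area's share of the total. Adjusting simultaneously to several sets of totals (for example, by age and sex and by race) would require a more elaborate procedure than this sketch shows.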

Potential Problems

Revised population estimate series - Postcensal population estimate series may be revised periodically as new data are added or as statistical methods that are used to generate estimates are modified or improved. When a population estimate series is revised, trends in birth and death rates can change (Solet, 1997). Therefore, analysts are advised to investigate the possible impact of revised population estimates on conclusions that might be drawn from their analyses, particularly when using small populations. Also, the release date for a population estimate series should be included in references for statistical analyses to assist in the reconciliation of rates prepared at different times.

Statistical imprecision of estimates - When population estimates are promulgated for subgroups defined by age, race, and sex and for small geographic areas (a critical need for analyzing local risk and need), the complexity mounts. While such estimates are provided in an extraordinarily precise format, the estimate-based imprecision, or error, increases as the size of the population subcategories decreases, particularly for small geographic areas. Because the magnitude of the error cannot be quantitatively assessed (at least until the next census), aggregation to larger geographic units or population groupings is recommended whenever possible since the estimates for larger areas are often more accurate.

Breaks in estimate series - When analyzing trends, one should use population estimates that form a continuous and consistent series over time. Unfortunately, continuous series are not always available. For example, the population estimates by race after 2000 do not always correspond to the estimates by race for the 1990s. The state and county estimates by OFM and DSHS for the 1990s used four mutually exclusive racial categories: white, black, Asian/Pacific Islander, and Native American. Estimates produced after 2000 might reflect changes in the way race data were collected in the 2000 census: (1) Pacific Islanders were separated from Asians and (2) respondents were allowed to mark more than one race, allowing for the creation of multi-racial groupings. See DOH's Guidelines for Using Racial and Ethnic Groups in Data Analyses and OFM's Understanding Census 2000: Race Category Changes and Comparisons (PDF, 44 KB) for more information.

For most populations, change in size and composition occurs gradually from year to year. As a result, population estimates tend to reflect a rather gradual progression. More pronounced changes occur largely due to migration into or out of an area. In small geographic areas, rapid change in population size often reflects changes in housing stock (e.g., loss of housing through urban renewal or addition through new developments). Such changes are real and do not represent "discontinuity" in a population estimate series.

Changes in geographic boundaries - In small area analysis, changes in boundaries can make trend analysis difficult or impossible. For example, census tract boundaries sometimes change from one census to the next when such changes are necessary to reflect changes in the composition of neighborhoods. In Washington State, approximately 33% of the Census 2000 population lived in a census tract whose number differed between 1990 and 2000 due to boundary changes and splits (Mohrman, 2003). Since zip codes may be modified by the U.S. Postal Service to reflect changes in postal routes, variation can occur from year to year. Two options exist for using zip code areas as a unit of analysis: (1) use constant zip code boundaries established for one year in a series, usually the most recent year, or (2) use zip codes as they are defined in each individual year. The analyst should choose the option that best suits his/her application. Comparability with numerator data is one consideration.

Definitions

Population estimates (by OFM): "…represent the resident population of an area as defined by the federal Bureau of the Census. The figures represent all the persons who usually reside in the area designated. This includes military personnel, military dependents, persons living in correctional institutions, and persons living in nursing homes and other care facilities. College students are considered residents of the place where they live while attending school. Seasonal populations, such as vacationers or migrant farm workers, are considered residents of the place they consider their usual residence. Persons with no usual residence are counted where they are on April 1." (OFM, September 1999)

Intercensal Population Estimates: estimated number of people at dates between two censuses that is derived from information from both censuses and possibly other sources (e.g., births, deaths, and data on migration). Estimates for the 1991-1999 period are intercensal estimates.

Postcensal Population Estimates: estimated number of people at dates following a census that is derived from information from that census and usually other sources, such as births, deaths, administrative records that reflect migration, and sometimes earlier censuses. Estimates for 2001, 2002, 2003 and beyond are postcensal estimates.

Population Projection or Forecast: estimated number of people, usually at a future date. (Note: Distinctions between postcensal estimates and projections are sometimes difficult to make since similar statistical methods and data sources may be used. See OFM, February 1999 for more details.)

Guidelines

Source of Estimates

The specific population estimate series recommended for use in a particular analysis depends on the level of geography; age, race, and sex breakdowns; and years required for the analysis. Use the most recent estimates available for the level of detail needed. To quickly find population estimate files on OFM's website, use the Key to Population Estimates.

  • State/county population estimates by age and sex: 

Use the most recent OFM "Intercensal and Postcensal Estimates of County Population by Age and Sex, 1980 - (most recent year)," for state and county population by five-year age groups (plus 15-17, 18-19) and sex. This series includes estimates for the state population. These estimates are available at the OFM website in both Excel and SAS formats. They are also available from Catherine O'Connor (360-236-4251).

  • State/county population estimates by race and ethnicity:

Intercensal estimates by age, sex, race/ethnicity: Public Health - Seattle and King County produced this intercensal series for the 1990s. Single-race estimates were generated using the NCHS bridging methodology (see Guidelines for Using Racial and Ethnic Groups in Data Analyses). County and state level data are available through Catherine O'Connor (360-236-4251).

Postcensal estimates by race/ethnicity (but not age): OFM will release estimates for 2001-2003 in January 2004. Historically, these estimates have been updated annually. The racial categories will be American Indian, Asian and Pacific Islander, black, multiracial, and white.

Postcensal estimates by age, sex, and race/ethnicity: Provisional 2002 estimates are available on the OFM website ("Provisional 2002 Estimates of the State Population by Age, Sex, Race, and Hispanic Origin, Washington State"). However, these estimates have not been updated to account for the NCHS bridging methodology (see Guidelines for Using Racial and Ethnic Groups in Data Analyses). OFM plans to release an updated 2002 estimate of age, sex, and race for the state and counties in November 2003. Even-year estimates (2004, 2006, etc.) will be generated thereafter. Estimates for odd years can be calculated with linear interpolation (see Appendix A below).

  • Sub-county population by age, race/ethnicity, and sex:

Sub-county estimates are not currently available. In July 2003, the Vista Advisory Group convened a workgroup to oversee the development of new intercensal and postcensal sub-county population estimates (age by sex by race/ethnicity). New sub-county estimates are anticipated to be available for public health assessment by mid-2004. In the meantime, the Assessment Operations Group (AOG) recommends the use of Census 2000 data for sub-county denominators at least through 2002, and possibly through 2003, especially for comparing areas whose rates will be more or less biased in the same direction by the undercount of the population.

Continuity

  • Use a continuous population estimate series prepared by a single source whenever possible. If discontinuities in a time series occur for either numerators or denominators, analyze the data for the two periods separately to determine the degree to which the shift affects the analysis of an underlying trend. (For these purposes, discontinuities would be breaks in a data series due to changes in data collection methods, estimation techniques, underlying assumptions of estimation models, or similar methodological issues.)
  • If a population estimate series is revised, determine whether rates calculated with the revised estimates would lead to different conclusions from data published previously. If so, recalculate rates using the new estimate series, especially if updating a trend analysis or releasing data in an established publication series.
  • When intercensal estimates are released for a prior decade, use the intercensal estimates in place of the postcensal estimates that were released during that decade.
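The check described above for a revised series can be sketched in Python as follows: recompute the rate with the revised denominators and inspect the percent change by year. The function names, years, event counts, and populations are all hypothetical and serve only to illustrate the comparison.

```python
def rate_per_100k(events, population):
    """Rate per 100,000 residents."""
    return events / population * 100_000


def percent_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100


# Hypothetical event counts and two hypothetical denominator series.
events = {2001: 120, 2002: 126}
original_pop = {2001: 48_000, 2002: 48_500}
revised_pop = {2001: 50_000, 2002: 51_000}

for year in sorted(events):
    old_rate = rate_per_100k(events[year], original_pop[year])
    new_rate = rate_per_100k(events[year], revised_pop[year])
    change = percent_change(old_rate, new_rate)
    print(f"{year}: {old_rate:.1f} -> {new_rate:.1f} ({change:+.1f}%)")
```

Whether a shift of a given size would change the conclusions of an analysis remains a judgment for the analyst; the sketch only makes the size of the shift visible.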

Small area estimates

  • Aggregate population estimates across small geographic areas (e.g., zip codes, census tracts) whenever possible to reduce the error and improve the stability of rates. 
  • Explain your choice of zip code boundaries - either constant boundaries over time (usually based on the most recent date in a time series) or boundaries that vary from year to year. 
  • Geo-code addresses in numerator data to correspond to the geography used in denominators, if possible (or notify user of possible differences in geographic boundaries used for numerators and denominators).
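A minimal Python sketch of the aggregation recommended above: events and populations are summed across small areas before the rate is computed, so the pooled rate rests on a larger denominator. The function name, area labels, and counts are hypothetical.

```python
def aggregate_rate(areas, per=100_000):
    """Pool (events, population) pairs across areas and return one rate."""
    total_events = sum(events for events, _ in areas.values())
    total_population = sum(population for _, population in areas.values())
    if total_population <= 0:
        raise ValueError("total population must be positive")
    return total_events / total_population * per


# Hypothetical (events, population estimate) pairs for three zip code areas.
zip_areas = {
    "zip_1": (2, 1_500),
    "zip_2": (0, 900),
    "zip_3": (4, 2_600),
}

pooled = aggregate_rate(zip_areas)
print(f"Pooled rate: {pooled:.1f} per 100,000")
```

The pooled rate is the total events divided by the total population, not the average of the individual area rates, which would give small areas undue weight.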

Citation

  • Always cite the source of the population estimates used in a table, analysis, or report, and include the following information:
      • Name of organization and subunit (if applicable) that prepared the estimates
      • Title of the estimate series
      • Date(s) to which the estimates pertain
      • Date of release or publication
      • Optional: file name or internet source (This may assist other users in locating the exact file that you are using.)
  • Example: Office of Financial Management, Forecasting Division. "Intercensal and Postcensal Estimates of County Population by Age and Sex, 1980-1999," October 1999.

References

Mohrman, Mike. Personal Communication. Office of Financial Management. October 2003.

Office of Financial Management, Forecasting Division. Washington State 2002 Population Trends. Olympia, Washington, September 2002.

Solet, David. "The new population estimates and community assessment: Practical considerations." Joint Conference on Health, Wenatchee, Washington, October 1997.

Appendix A: Linear Interpolation

Linear interpolation is a method sometimes used to obtain population estimates for a time reference between two known (or estimated) years assuming a linear model of population change. The method is fairly straightforward and assumes that the relationship between the two known points is a linear function:

f(x) = mx + b or, in algebraic terms, y = mx + b,

where m = the slope of the line (rise/run) and b = the point at which the line crosses the y-axis (y-intercept). Two known ordered pairs (X1,Y1) (X2,Y2) are needed to determine the slope of a line:

m = [(Y2 - Y1) / (X2 - X1)], and

b = Y1 - mX1

Using the population data below, the 1999 WA State total population estimate (P) can be calculated from a linear function. The variables are:

X1 = 1998 Y1 = 5,750,033

X2 = 2000 Y2 = 5,894,121

m = [(5894121 - 5750033) / (2000 - 1998)] = 72044

b = 5750033 - (72044 x 1998) = -138193879

P1999 = (72044 x 1999) + (-138193879) = 5822077

This calculation can be entered into a spreadsheet to produce values for a range of substrata as illustrated below. The percent difference between the population estimate using this method and the OFM official population estimate (OFM, 2/2003) is 0.15%. This method becomes less reliable when the time interval between known points increases.
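The same calculation can also be sketched in Python. The function name is hypothetical; the input values are the 1998 and 2000 state totals from the worked example above.

```python
def interpolate_population(x1, y1, x2, y2, x):
    """Estimate the population at time x on the line through (x1, y1) and (x2, y2)."""
    if x1 == x2:
        raise ValueError("x1 and x2 must differ")
    m = (y2 - y1) / (x2 - x1)  # slope: change in population per year
    b = y1 - m * x1            # y-intercept
    return m * x + b


# Values from the worked example: WA State totals for 1998 and 2000.
p1999 = interpolate_population(1998, 5_750_033, 2000, 5_894_121, 1999)
print(round(p1999))  # 5822077
```

Because 1999 is the midpoint of the interval, the result equals the average of the two known values. Applying the function to each stratum in turn produces interpolated values for a range of substrata.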


Washington State Department of Health
101 Israel Rd SE, PO Box 47812
Olympia, WA 98504-7812

Last Update : 10/20/2009 11:39 AM