Guidelines for Selection of Population Denominators
Purpose
Background and Context
Sources of Estimates
Potential Problems
Revised population estimate series
Statistical imprecision of estimates
Breaks in estimate series
Changes in geographic boundaries
Definitions
Population estimates (by OFM)
Intercensal Population Estimates
Postcensal Population Estimates
Population Projection or Forecast
Guidelines
Source of Estimates
Continuity
Small area estimates
Citation
References
Appendix A: Linear Interpolation
Guidelines
For Selection of Population Denominators (Word Document)
Purpose
The Assessment Operations Group in the Washington
State Department of Health coordinates the development
of guidelines related to data development and use in
order to promote good professional practice among staff
involved in assessment activities within the Washington
State Department of Health and in Local Health
Jurisdictions in Washington. While the guidelines are
intended for audiences with differing levels of training
in data development and use, they assume a basic
knowledge of epidemiology and biostatistics. They are
not intended to recreate basic texts and other sources
of information on the topics covered by the
guidelines; rather, they focus on issues commonly
encountered in public health practice and, where
applicable, on issues unique to Washington State.
Background and Context
Since public health assessment relies extensively on
the use of rates to describe the health status of a
particular community, the denominators used in the
calculation of rates are of considerable importance. The
most commonly used denominator is the number of
residents of an area. For most populations, actual
counts are only available from the decennial census. As
a result, population estimates are typically used as the
basis for rates in the years between censuses. Since
health status tends to vary by such socio-demographic
characteristics as age, race/ethnicity, and sex,
population estimates by these characteristics are
extremely useful, and these estimates are often sought
for many levels of geography including the state,
counties, census tracts, and zip code areas.
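The dependence of a rate on its denominator can be seen in a minimal sketch. The function name and all figures below are hypothetical and chosen only for illustration; they are not drawn from any OFM or census file.

```python
def rate_per_100k(events, population):
    """Crude rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population denominator must be positive")
    return events / population * 100_000


# Hypothetical figures: the same event count divided by two
# competing population estimates for the same area and year.
deaths = 120
rate_a = rate_per_100k(deaths, 48_000)  # estimate from source A (assumed)
rate_b = rate_per_100k(deaths, 50_000)  # estimate from source B (assumed)

print(round(rate_a, 1), round(rate_b, 1))
```

A difference of about 4% between the two denominators produces a difference of similar size in the rate, which is why a common set of estimates is useful for comparability.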
Sources of Estimates
The U.S. Census Bureau enumerates the population every ten years. In
other years, federal, state, and local governments and
various private companies prepare intercensal and
postcensal population estimates and projections for
selected geographic units. Estimates made by these
sources often differ due to differences in their
estimation and projection methods, underlying
assumptions, type of data used, and familiarity with
idiosyncrasies of a population's demographic history.
Since rates are affected by the population estimates
that are used as denominators, the use of a common set
of estimates among those routinely preparing state and
local health statistics is strongly recommended for the
sake of comparability and consistency.
The Forecasting Division of the Washington State
Office of Financial Management (OFM) prepares a number
of population estimates and projections for state,
county, and city populations which are available at the OFM
website. Since the estimates prepared by OFM are
used throughout state and local government for planning,
analysis, and revenue allocation, their population
estimates are considered a standard which should be
adopted whenever possible.
Since OFM does not prepare population estimates at
levels required for some types of health analyses, other
sources for these estimates must sometimes be used.
These should be adjusted to correspond to OFM county and
state totals.
For example, the Research and Data Analysis Division at the
Department of Social and Health Services (DSHS) prepared
population estimates by age, race/ethnicity, and sex for
state, county, and sub-county areas (e.g., census
tracts, zip codes, school districts, and legislative
districts). These estimates were called the "DSHS
Adjusted Population Estimates" because they were
developed by adjusting sub-county estimates purchased
from a private vendor to correspond to OFM's state and
county population estimates by age and sex and by race
and Hispanic origin. The DSHS Adjusted Population
Estimates are outdated and no longer recommended for use
since they have not been adjusted for the 2000 census
anchor point.
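One simple way to make estimates from another source correspond to an OFM total is proportional scaling, sketched below. This is an assumption made for illustration: the text above does not describe the exact procedure DSHS used, and the function name, area labels, and figures are all hypothetical.

```python
from typing import Dict


def adjust_to_control_total(estimates: Dict[str, float],
                            control_total: float) -> Dict[str, float]:
    """Scale each estimate by a common factor so the set sums to the control total."""
    raw_total = sum(estimates.values())
    if raw_total <= 0:
        raise ValueError("estimates must sum to a positive number")
    factor = control_total / raw_total
    return {area: value * factor for area, value in estimates.items()}


# Hypothetical vendor estimates for three tracts in one county.
vendor_tracts = {"tract_1": 4_100, "tract_2": 2_950, "tract_3": 3_200}

# Hypothetical county total standing in for the OFM figure.
county_total = 10_000

adjusted = adjust_to_control_total(vendor_tracts, county_total)
for area, value in adjusted.items():
    print(area, round(value))
```

Scaling preserves each area's share of the county while forcing the sum to match the control total. Adjusting simultaneously by age, sex, and race would require a more elaborate method than this sketch shows.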
Potential Problems
Revised population estimate series - Postcensal
population estimate series may be revised periodically
as new data are added or as statistical methods that are
used to generate estimates are modified or improved.
When a population estimate series is revised, trends in
birth and death rates can change (Solet, 1997).
Therefore, analysts are advised to investigate the
possible impact of revised population estimates on
conclusions that might be drawn from their analyses,
particularly when using small populations. Also, the
release date for a population estimate series should be
included in references for statistical analyses to
assist in the reconciliation of rates prepared at
different times.
Statistical imprecision of estimates -
When population estimates are promulgated for subgroups
defined by age, race, and sex and for small geographic
areas (a critical need for analyzing local risk and
need), the complexity mounts. While such estimates are
provided in an extraordinarily precise format, the
estimate-based imprecision, or error, increases as the
size of the population subcategories decreases,
particularly for small geographic areas. Because the
magnitude of the error cannot be quantitatively assessed
(at least until the next census), aggregation to larger
geographic units or population groupings is recommended
whenever possible since the estimates for larger areas
are often more accurate.
Breaks in estimate series - When analyzing trends, one
should use population estimates that form a continuous
and consistent series over time. Unfortunately,
continuous series are not always available. For example,
the population estimates by race after 2000 do not
always correspond to the estimates by race for the
1990s. The state and county estimates by OFM and DSHS
for the 1990s used four mutually exclusive racial
categories: white, black, Asian/Pacific Islander, and
Native American. Estimates produced after 2000 might
reflect changes in the way race data were collected in
the 2000 census: (1) Pacific Islanders were separated
from Asians and (2) respondents were allowed to mark
more than one race, allowing for the creation of
multi-racial groupings. See DOH's Guidelines
for Using Racial and Ethnic Groups in Data Analyses
and OFM's Understanding
Census 2000: Race Category Changes and Comparisons
for more information.
For most populations, change in size and composition
occurs gradually from year to year. As a result,
population estimates tend to reflect a rather gradual
progression. More pronounced changes occur largely due
to migration into or out of an area. In small geographic
areas, rapid change in population size often reflects
changes in housing stock (e.g., loss of housing through
urban renewal or addition through new developments).
Such changes are real and do not represent
"discontinuity" in a population estimate
series.
Changes in geographic boundaries - In small area
analysis, changes in boundaries can make trend analysis
difficult or impossible. For example, census tract
boundaries sometimes change from one census to the next
when such changes are necessary to reflect changes in
the composition of neighborhoods. In Washington State,
approximately 33% of the Census 2000 population lived in
a census tract whose number changed between 1990 and 2000
because of boundary changes and splits (Mohrman, 2003). Since
zip codes may be modified by the U.S. Postal Service to
reflect changes in postal routes, variation can occur
from year to year. Two options exist for using zip code
areas as a unit of analysis: (1) use constant zip code
boundaries established for one year in a series, usually
the most recent year or (2) use zip codes as they are
defined in each individual year. The analyst should
choose the option that best suits the application.
Comparability with numerator data is one consideration.
Definitions
Population estimates (by OFM): "represent the
resident population of an area as defined by the federal
Bureau of the Census. The figures represent all the
persons who usually reside in the area designated. This
includes military personnel, military dependents,
persons living in correctional institutions, and persons
living in nursing homes and other care facilities.
College students are considered residents of the place
where they live while attending school. Seasonal
populations, such as vacationers or migrant farm
workers, are considered residents of the place they
consider their usual residence. Persons with no usual
residence are counted where they are on April 1." (OFM,
September 1999)
Intercensal Population Estimates: estimated number of
people at dates between two censuses that is derived
from information from both censuses and possibly other
sources (e.g., births, deaths, and data on migration).
Estimates for the 1991-1999 period are intercensal
estimates.
Postcensal Population Estimates: estimated number of people
at dates following a census that is derived from
information from that census and usually other sources,
such as births, deaths, administrative records that
reflect migration, and sometimes earlier censuses.
Estimates for 2001, 2002, 2003 and beyond are postcensal
estimates.
Population Projection or Forecast: estimated number of
people, usually at a future date. (Note: Distinctions
between postcensal estimates and projections are
sometimes difficult to make since similar statistical
methods and data sources may be used. See OFM, February
1999 for more details.)
Guidelines
Source of Estimates
The specific population estimate series recommended
for use in a particular analysis depends on the level of
geography; age, race, and sex breakdowns; and years
required for the analysis. Use the most recent estimates
available for the level of detail needed. To quickly
find population estimate files on OFM's website, use the
Key to Population Estimates.
- State/county population estimates by age and
sex:
Use the most recent OFM "Intercensal and
Postcensal Estimates of County Population by Age and
Sex, 1980 - (most recent year)," for state and
county population by five-year age groups (plus 15-17,
18-19) and sex. This series includes estimates for the
state population. These estimates are available at the
OFM website in both Excel and SAS formats. They are
also available from Catherine
O'Connor (360-236-4251).
- State/county population estimates by race and
ethnicity:
Intercensal estimates by age, sex,
race/ethnicity: Public Health - Seattle and King
County produced this intercensal series for the 1990s.
Single-race estimates were generated using the NCHS
bridging methodology (see Guidelines
for Using Racial and Ethnic Groups in Data Analyses).
County and state level data are available through
Catherine O'Connor (360-236-4251).
Postcensal estimates by race/ethnicity (but
not age): OFM will release estimates for 2001-2003 in
January 2004. Historically, these estimates have been
updated annually. The racial categories will be
American Indian, Asian and Pacific Islander, black,
multiracial, and white.
Postcensal estimates by age, sex, and
race/ethnicity: Provisional 2002 estimates are available on the
OFM website ("Provisional 2002 Estimates of the
State Population by Age, Sex, Race, and Hispanic
Origin, Washington State"). However, these estimates
have not been updated to account for the NCHS bridging
methodology (see Guidelines
for Using Racial and Ethnic Groups in Data Analyses).
OFM plans to release an updated 2002 estimate of age,
sex, and race for the state and counties in November
2003. Even-year estimates (2004, 2006, etc.) will be
generated thereafter. Estimates for odd years can be
calculated with linear interpolation (see Appendix A
below).
- Sub-county population by age, race/ethnicity, and
sex:
Sub-county estimates are not currently available.
In July 2003, the Vista Advisory Group convened a
workgroup to oversee the development of new
intercensal and postcensal sub-county population
estimates (age by sex by race/ethnicity). New
sub-county estimates are anticipated to be available
for public health assessment by mid-2004. In the
meantime, the AOG recommends the use of Census 2000
data for sub-county denominators at least through
2002, and possibly through 2003, especially for
comparing areas whose rates will be more or less
biased in the same direction by the undercount of the
population.
Continuity
- Use a continuous population estimate series
prepared by a single source whenever possible. If
discontinuities in a time series occur for either
numerators or denominators, analyze the data for
the two periods separately to determine the degree
to which the shift affects the analysis of an
underlying trend. (For these purposes,
discontinuities would be breaks in a data series
due to changes in data collection methods,
estimation techniques, underlying assumptions of
estimation models, or similar methodological
issues.)
- If a population estimate series is revised,
determine whether rates calculated with the
revised estimates would lead to different
conclusions from data published previously. If so,
recalculate rates using the new estimate series
especially if updating a trend analysis or
releasing data in an established publication
series.
- When intercensal estimates are released for a
prior decade, use the intercensal estimates in
place of the postcensal estimates that were
released during that decade.
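The check described in the second item above can be sketched as a comparison of the same event count against the original and the revised denominator. All names and figures here are hypothetical, and the threshold at which a difference changes a conclusion remains a matter of analyst judgment.

```python
def rate_per_100k(events, population):
    """Crude rate per 100,000 residents."""
    return events / population * 100_000


def percent_change(old_value, new_value):
    """Percent change from old_value to new_value."""
    return (new_value - old_value) / old_value * 100


# Hypothetical figures for one small county and one year.
events = 85
original_population = 20_000   # estimate used in the earlier publication
revised_population = 21_250    # estimate from the revised series

old_rate = rate_per_100k(events, original_population)
new_rate = rate_per_100k(events, revised_population)
change = percent_change(old_rate, new_rate)

print(round(old_rate, 1), round(new_rate, 1), round(change, 1))
```

Repeating the comparison for every year in a trend shows whether the revision shifts the level of the rates, their slope, or both.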
Small area estimates
- Aggregate population estimates across small
geographic areas (e.g., zip codes, census tracts)
whenever possible to reduce the error and improve
the stability of rates.
- Explain your choice of zip code boundaries -
either constant boundaries over time (usually based
on the most recent date in a time series) or
boundaries that vary from year to year.
- Geo-code addresses in numerator data to correspond
to the geography used in denominators, if possible
(or notify user of possible differences in
geographic boundaries used for numerators and
denominators).
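Aggregation across small areas, as recommended in the first item above, amounts to summing numerators and denominators before dividing. The sketch below uses hypothetical area labels and counts, and the function name is an assumption for illustration.

```python
def pooled_rate_per_100k(areas):
    """Rate per 100,000 for a group of areas, computed from summed counts."""
    total_events = sum(area["events"] for area in areas)
    total_population = sum(area["population"] for area in areas)
    if total_population <= 0:
        raise ValueError("combined population must be positive")
    return total_events / total_population * 100_000


# Hypothetical small areas with few events each.
small_areas = [
    {"name": "area_A", "events": 3, "population": 1_200},
    {"name": "area_B", "events": 1, "population": 800},
    {"name": "area_C", "events": 4, "population": 2_000},
]

combined = pooled_rate_per_100k(small_areas)
print(round(combined, 1))
```

Summing counts first, rather than averaging the three area rates, weights each area by its population and keeps a single small denominator from dominating the result.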
Citation
- Always cite the source of the population
estimates used in a table, analysis, or report,
and include the following information:
- Name of organization and subunit (if
applicable) that prepared the estimates
- Title of the estimate series
- Date(s) to which the estimates pertain
- Date of release or publication
- Optional: file name or internet source (This may assist other users in locating the
exact file that you are using.)
- Example: Office of Financial Management,
Forecasting Division. "Intercensal and
Postcensal Estimates of County Population by Age
and Sex, 1980-1999," October 1999.
References
Mohrman, Mike. Personal Communication. Office of
Financial Management. October 2003.
Office of Financial Management, Forecasting Division.
Washington State 2002 Population Trends. Olympia,
Washington, September 2002.
Solet, David. "The new population estimates and
community assessment: Practical considerations."
Joint Conference on Health, Wenatchee, Washington,
October 1997.
Appendix A: Linear
Interpolation
Linear interpolation is a method sometimes used to
obtain population estimates for a time reference between
two known (or estimated) years assuming a linear model
of population change. The method is fairly
straightforward and assumes that the relationship
between the two known points is a linear function:
ƒ(x) = mx + b or, in algebraic terms, y = mx + b,
where m = the slope of the line (rise/run) and b =
the point at which the line crosses the y-axis
(y-intercept). Two known ordered pairs, (X1, Y1) and (X2, Y2),
are needed to determine the slope of a line:
m = (Y2 - Y1) / (X2 - X1), and
b = Y1 - mX1
Using the population data below, the 1999 WA State total
population estimate (P) can be calculated from a linear
function. The variables are:
X1 = 1998 Y1 = 5,750,033
X2 = 2000 Y2 = 5,894,121
m = [(5894121 - 5750033) / (2000 - 1998)] = 72044
b = 5750033 - (72044 x 1998) = -138193879
P1999 = (72044 x 1999) + (-138193879) = 5822077
This calculation can be entered into a spreadsheet to
produce values for a range of substrata as illustrated
below. The percent difference between the population
estimate obtained with this method and the official OFM
population estimate (OFM, 2/2003) is 0.15%. This method becomes
less reliable when the time interval between known
points increases.
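The same calculation can also be written as a short function, offered here as a sketch rather than as an OFM or DOH tool. The function name is hypothetical; the input values are the 1998 and 2000 state totals given above.

```python
def interpolate_population(x1, y1, x2, y2, x):
    """Linear interpolation: fit y = m*x + b through two points and evaluate at x."""
    if x1 == x2:
        raise ValueError("the two known years must differ")
    m = (y2 - y1) / (x2 - x1)   # slope: population change per year
    b = y1 - m * x1             # y-intercept
    return m * x + b


# Known points from the worked example above.
p1999 = interpolate_population(1998, 5_750_033, 2000, 5_894_121, 1999)
print(round(p1999))  # 5822077
```

Applying the function to each age, sex, or race stratum separately yields interpolated values for a range of substrata, in the same way as the spreadsheet approach.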
