Guidelines for Using and Developing Rates for Public Health Assessment
Rate Basics
Why are rates used in public health assessment?
What is the difference between crude, age-adjusted and age-specific rates?
Should I calculate a rate when the number of events is small?
Will the rate be compared to other rates?
Guidelines for using and developing rates
Number of events (numerator)
Population at risk (denominator)
Crude rates
Age-adjusted rates, direct method
Age-specific rates
Comparing rates
Unstable rates due to small numbers
Methods for age adjustment
Direct method of age-adjustment
US Standard Populations for direct adjustment
Indirect adjustment
Glossary
References and Resources
Rate Basics
A rate consists of a numerator and a denominator.
The numerator is the number of health events.
This is often the same as the number of people who experience an event, but for some
health conditions, one person may experience the event more than once. For example, one
individual may have multiple hospitalizations for the same condition in a given year.
To measure incidence or prevalence of the
condition, you usually want to count people. To measure the public health burden, you may
want to count events. Actions based on the data may be different depending on whether the
rate represents many individuals with only one event or a smaller number of individuals
who have had many events.
It is customary to count only events that occur among the population at risk.
Guideline: Number of events (numerator)
The denominator is also known as the population
at risk. Everyone in the population at risk must be eligible to be counted in the
numerator if they have the event of interest. For example, in looking at female breast
cancer, we cannot include men in the population at risk, because men with breast cancer
would not be included in the numerator.
Guideline: Population at risk (denominator)
Once the numerator and denominator are
established, how do we decide which rate is the most appropriate to use? The following
questions are useful.
Why are rates used in public health assessment?
Much of public health assessment involves
describing the health status of a defined community by looking at changes in the community
over time or by comparing health events in that community to events occurring in other
communities or the state as a whole. In making these comparisons, we need to account for
the fact that the number of health events depends in part on the number of people in the
community. To account for growth in a community or to compare communities of different
sizes, we usually develop rates to provide the number of events per population unit.
Also, the frequency with which health events
occur is almost always related to age. For example, acute respiratory infections are more
common in children of school age because of their immunologic susceptibility and exposure
to other children in schools. Chronic conditions, such as arthritis and atherosclerosis,
occur more frequently in older adults because of a variety of physiologic consequences of
aging. Mortality tends to increase rapidly after the age of 40. In fact, the relationship
of age to risk often dwarfs other important risk factors. Because the relationship of age
to risk is often resistant or impervious to interventions, analysts often remove the
effects of differences in age structure when comparing rates across populations by
calculating age-adjusted and age-specific rates.
What is the difference between crude, age-adjusted and age-specific rates?
Crude rates
Crude rates are recommended when a summary
measure is needed and it is not necessary or desirable to adjust for other factors. For
example, rates of infectious diseases, such as tuberculosis and hepatitis, are usually not
age adjusted, because public health officials are interested in the overall burden of
disease in the total population irrespective of age.
A crude rate is calculated by dividing the total
number of events in a specified time period by the total number of individuals in the
population who are at risk for these events and multiplying by a constant, such as 1,000
or 100,000 [e.g., (numerator/denominator) x constant]. For example, number of deaths in
King County for 1999 (numerator) divided by the population of King County in 1999
(denominator) times 100,000 (constant) gives the 1999 crude death rate per 100,000
population for King County.
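As a minimal sketch of the calculation described above, the Python function below applies (numerator/denominator) x constant. The function name `crude_rate` and the counts are hypothetical and chosen only for illustration; they are not actual King County figures.

```python
def crude_rate(events, population, constant=100_000):
    """Return the number of events per `constant` people at risk."""
    if population <= 0:
        raise ValueError("population at risk must be positive")
    return events / population * constant


# Hypothetical counts for illustration only (not actual King County data).
deaths = 12_000
population = 1_700_000
print(f"{crude_rate(deaths, population):.1f} deaths per 100,000 population")
```

Changing `constant` to 1,000 gives a rate per 1,000, which is the form commonly used for birth data.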
Guideline: Crude Rates
Age-adjusted rates
Adjusted rates are used when comparing rates of
health events affected by confounding factors. They are used when comparing different
populations or for comparing trends in a given population over time. Because the
occurrence of many health conditions is related to age, the most common adjustment for
public health data is age-adjustment.
The age-adjustment process removes differences
in the age composition of two or more populations to allow comparisons between these
populations independent of their age structure. For example, a county's age-adjusted
death rate is the weighted average of the age-specific death rates observed in that
county, with the weights derived from the age distribution in an external population
standard, such as the U.S. population. Different standard populations have different age
distributions and the choice will affect the resulting age-adjusted rate. If the
age-adjusted rates for different counties are calculated with the same weights (i.e.,
using the same population standard), the effect of any differences in the counties'
age distributions is removed.
Currently, the National Center for Health
Statistics (NCHS) age-adjusts rates using the US 1940 standard population. Other agencies
use the US 1970 Standard. Beginning with 1999 data, federal agencies will age-adjust to
the US 2000 Standard Population.
Age-adjusted rates should be presented when a
single, summary measure is needed, but data analysts should inspect age-specific rates
first. (Choi, 1999)
- Age-adjusted rates can mask important trends. For
instance, while recent trends in cancer mortality show decreasing death rates for people
under 24 and increasing rates for people over 65, the age-adjusted rates changed very
little. (Anderson, 1998)
- Age-adjusted rates can over- or under-estimate
differences. For instance, when age-specific rates of the populations being compared do
not show a consistent relationship (i.e., the trend is not in the same direction for all
age-specific rates or the ratio of age-specific rates is different for different age
groups), the relationship of age-adjusted rates can vary with the choice of a standard
population. If the pattern is not consistent, the use of age-specific rates, rather than
age-adjusted rates, is recommended.
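The second point can be illustrated with a small sketch. The rates and standard populations below are entirely hypothetical: population A has the lower rate in the younger group and the higher rate in the older group, so the age-specific rates do not show a consistent relationship. With these made-up numbers, the comparison of age-adjusted rates reverses when the standard changes.

```python
def age_adjusted(age_specific_rates, standard_weights):
    """Weighted average of age-specific rates (direct method)."""
    return sum(r * w for r, w in zip(age_specific_rates, standard_weights))


# Hypothetical rates per 100,000 for two age groups (younger, older).
rates_a = [100, 1000]  # population A: lower when young, higher when old
rates_b = [200, 800]   # population B: the reverse pattern

# Two hypothetical standards with different shares of (younger, older).
young_standard = [0.8, 0.2]
old_standard = [0.4, 0.6]

for name, std in [("younger standard", young_standard),
                  ("older standard", old_standard)]:
    a = age_adjusted(rates_a, std)
    b = age_adjusted(rates_b, std)
    print(f"{name}: A = {a:.0f}, B = {b:.0f}")
```

Under the younger standard B appears higher than A; under the older standard A appears higher than B. Inspecting the age-specific rates first shows why neither summary tells the whole story.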
Guideline: US Standard Populations for Direct Adjustment
Guideline: Age-adjusted Rates, Direct Method
When the number of events is relatively small,
the age-specific rates needed to calculate an age-adjusted rate by the direct method are
unstable. This may result in unstable age-adjusted rates when using the direct method of
age-adjustment. Additionally, since the age-adjusted rate calculated by the direct method
provides a somewhat arbitrary summary statistic that depends on the choice of a standard,
it may not provide the best summary measure in explaining health status to communities. An
alternative approach is to develop ratios using indirect adjustment.
Guideline: Indirect Adjustment
Age-specific rates
Because age-adjusted rates can mask important
trends or over- or under-estimate differences, age-specific rates are used for comparing
age-defined subgroups when rates are strongly age-dependent. Age-specific rates are also
used when specific causal or protective factors or the prevalence of risk exposures are
different at different ages. For example, males 15-24 years of age (related to motor
vehicle occupant injuries) and people 75 or older (mainly due to falls) are at highest
risk for head injury. A rate developed for a restricted age range is sometimes called
an age-limited rate.
Guideline: Age-Specific Rates
Should I calculate a rate when the number of events is small?
[Guidelines for small numbers covering both
statistical stability and confidentiality are under development.] Rates based on small
numbers of events can fluctuate widely from year to year for reasons other than a true
change in the underlying frequency of occurrence of the event.
Guideline: Small Numbers
Will the rate be compared to other rates?
When calculating rates, the numerator and
denominator (i.e., events and population) must be defined consistently over time and
place. Areas where public health professionals are most likely to find inconsistencies
include:
- The definition of the health event:
Because analysts may classify diseases differently, it is useful to compare technical
definitions of the number of events before comparing rates. For example, coronary heart
disease is defined differently in Healthy People 2000 than in information from the
National Center for Health Statistics; there are differences in the definitions of several
cancer sites between the National Center for Health Statistics and the National Cancer
Institute.
- The coding scheme: Coding schemes that may
appear to collect the same information often contain subtle differences. For example,
starting with 1999 deaths, state and federal agencies will code cause of death according
to the International Classification of Diseases, 10th Revision (ICD-10).
Differences in coding practices between ICD-9 and ICD-10 will affect mortality rates and
trends for many disease categories. It is anticipated that hospitalization data will
continue to be coded according to the ICD-9-Clinical Modification (CM) until 2001 or 2002
before converting to ICD-10-CM. Differences in the coding of deaths versus
hospitalizations need to be considered when planning assessment projects that combine
data from these two sources.
- Data collection and definitions: Data
collection processes or definitions for the number of events or the population at risk may
change over time.
- Geographies: Data collection may vary from
one geographic area to another. This issue is generally more relevant to the number of
events than the population at risk and more relevant to some types of events than others.
In addition to the previous issues, when
comparing age-adjusted rates, the standard population must be the same for all rates to be
compared. Different national, state and local agencies may use different standard
populations when age-adjusting. International agencies also usually use different standard
populations than those used in the United States.
When comparing age-specific rates, if the age
categories are relatively large, it is important to consider the possibility of residual
confounding by age. For example, if the proportions of very old individuals in the group
"65 years and over" are different in two populations being compared, differences
in rates may be a reflection of the difference in the age distribution of the populations.
When age categories are relatively wide, consider developing age-specific rates using
smaller age groups or age-adjusting within the broader age group.
Guideline: Comparing Rates
[This is included to provide basic information
until resources permit the development of a guideline for confidence intervals or
significance testing.]
Surveillance data, even if based on complete
counts, may be affected by chance. If variation in the occurrence of the disease is random
and not affected by differential diagnosing, reporting, or other systematic differences,
confidence intervals (CIs) may be calculated to facilitate comparisons over time or
between geographic locations (e.g., counties).
- When the number of events
is small in relation to the population at risk (i.e., the event is rare), calculation of
95% CIs based on the Poisson probability distribution is recommended. (95% CIs correspond
to a significance level of 0.05. If you are making many comparisons, remember that
approximately 5% of the comparisons may be statistically significant due to chance alone.)
For crude rates, the CI can be developed by using a table or computer software to
determine the CI around the number of events and then calculating rates from the lower
and upper confidence limits. For age-adjusted rates, use the method developed by Fay and
Feuer (1997). In general, if the CIs for two separate rates do not overlap, the
difference between the two rates is statistically significant. Overlapping CIs are
usually read as no statistically significant difference, although this rule is
conservative.
Narrow CIs for rates indicate with greater
certainty that the calculated rate is a reliable approximation of the true rate, while
wide CIs signal greater variability and less certainty that the calculated rate is a good
estimation of the true rate.
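The crude-rate procedure described above (find Poisson confidence limits around the number of events, then convert each limit to a rate) can be sketched as follows. This is one possible implementation, not a prescribed one: it finds exact Poisson limits by bisection using only the Python standard library, whereas a published table or statistical software would serve equally well. The function names and the example counts are hypothetical, and the direct summation is only suitable for small to moderate counts.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), by direct summation."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def _root_of_decreasing(f, lo, hi, iterations=200):
    """Bisection for f(mu) = 0 where f decreases in mu on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_ci(events, alpha=0.05):
    """Exact two-sided confidence limits for a Poisson count."""
    search_top = events + 10 * math.sqrt(events + 1) + 10  # generous bound
    if events == 0:
        lower = 0.0
    else:
        # Lower limit: the mean at which P(X >= events) equals alpha / 2.
        lower = _root_of_decreasing(
            lambda mu: poisson_cdf(events - 1, mu) - (1 - alpha / 2),
            0.0, events)
    # Upper limit: the mean at which P(X <= events) equals alpha / 2.
    upper = _root_of_decreasing(
        lambda mu: poisson_cdf(events, mu) - alpha / 2, events, search_top)
    return lower, upper


def crude_rate_ci(events, population, constant=100_000, alpha=0.05):
    """Convert the limits around the count into limits around the rate."""
    lower, upper = exact_poisson_ci(events, alpha)
    return lower / population * constant, upper / population * constant


# Hypothetical example: 10 events among 50,000 people.
low, high = crude_rate_ci(10, 50_000)
print(f"rate = {10 / 50_000 * 100_000:.1f} per 100,000, "
      f"95% CI = ({low:.1f}, {high:.1f})")
```

With only 10 events the interval is wide relative to the rate itself, which is the instability that the small-numbers guidance warns about.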
Confidence intervals around rates account for
random fluctuation but not bias. Bias is also known as systematic error. Bias can
occur, for example, when reporting or measuring practices vary by geographic region, time
period, or the person making the report. For example, if a large proportion of a
county's hospitalizations occur in hospitals that are not included in the statewide
hospitalization database (such as military and veterans' hospitals or out-of-state
hospitals), the hospitalization rate for that county will be biased downward.
Guidelines for using and developing rates
Number of events (numerator)
- In developing a rate, it is customary to count
only events that occur among the population at risk.
- For conditions where one person may be counted
more than once, note whether you are counting events or people.
- Tables containing rates should provide precise
definitions of the number of events (e.g., specify ICD codes) so that people using the
information can calculate comparable rates.
- When reporting rates, the data source for the
number of events should be specified.
Population at risk (denominator)
- Everyone in the population at risk must be
eligible to be counted in the numerator if they have the event of interest. For example,
in looking at female breast cancer, we cannot include men in the population at risk,
because men with breast cancer would not be included in the number of events.
- See population
denominators for guidelines related to choice of populations at risk. [Link to be
completed.]
- When reporting rates, the data source for the
population at risk should be specified.
Crude rates
- Crude rates are recommended when a summary
measure is needed and it is not necessary or desirable to adjust for age.
- Choose a constant (e.g., rate per 1,000 or rate
per 100,000) that is compatible with commonly reported rates for the topic (e.g., birth
information is generally reported per 1,000 live births; death information is commonly
reported per 100,000 population.)
Age-adjusted rates, direct method
- Age-adjusted rates are recommended when making
comparisons in the rates of age-related health events between different populations or for
comparing trends in a given population over time.
- Age-adjusted rates are essential for events that
vary with age (e.g., cancer deaths), when comparing populations with different
age distributions.
- Age-adjusted rates should be used only for the
purpose of comparison. Because an age-adjusted rate is based on an external standard
population, it does not reflect the absolute frequency of the event in a population.
- Age-adjusted rates should be presented when a
single, summary measure is needed, but data analysts should inspect
age-specific rates first. (Choi, 1999)
- Use of the US 2000 standard population is
recommended in current analyses, unless there is a need to compare to data that have been
adjusted to another standard. In the latter case, you must use the standard population
used in the comparison data. See US Standard Populations.
- The development of age-specific rates is
recommended before developing age-adjusted rates to determine whether the populations
being compared show a consistent relationship among age-specific rates. If the pattern is
not consistent, use of age-specific rates, rather than age-adjusted rates, is
recommended.
Age-specific rates
- Age-specific rates are recommended for comparing
age-defined subgroups between or within populations when rates are strongly
age-dependent.
- Age-specific rates are recommended when specific
causal or protective factors or the prevalence of risk exposures are different at
different ages.
- In defining sub-groups for age-specific rates,
select age ranges appropriate to the condition of interest.
Comparing rates
- Only compare rates when the numerator and
denominator (i.e., events and population) are defined consistently over time and place.
Look for
- Consistency in definition of event
- Consistency in coding scheme
- Consistency over time
- Consistency among geographies
- If comparing age-adjusted rates, compare rates
that have been adjusted to the same standard population.
- When comparing age-specific rates, if the age
categories are relatively large, it is important to consider the possibility of residual
confounding by age.
Unstable rates due to small numbers
[Guidelines for small numbers covering both statistical stability and
confidentiality are under development.]
Rates based on small numbers of events can
fluctuate widely from year to year for reasons other than a true change in the underlying
frequency of occurrence of the event.
- Calculation of rates is not recommended when
there are fewer than five events in the numerator, because the calculated rate is unstable
and exhibits wide confidence intervals.
- Small counts should be included, where possible,
even if the rates are not reported, so that the counts can be combined into larger totals
(for example, three or five year averages) which would be more stable.
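The two points above can be sketched together: suppress a rate when there are fewer than five events, but keep the counts so that several years can be pooled. The helper name `rate_or_none`, the years, and all counts below are hypothetical.

```python
MIN_EVENTS = 5  # threshold named in the guideline above


def rate_or_none(events, population, constant=100_000):
    """Return the rate, or None when the count is too small to report."""
    if events < MIN_EVENTS:
        return None
    return events / population * constant


# Hypothetical annual counts and populations for one small community.
events_by_year = {1997: 2, 1998: 4, 1999: 3}
population_by_year = {1997: 10_000, 1998: 10_100, 1999: 10_200}

# Each single year falls below the threshold, so no annual rate is reported.
for year, events in events_by_year.items():
    print(year, rate_or_none(events, population_by_year[year]))

# Pool the three years: total events over total person-years gives an
# average annual rate for the period.
total_events = sum(events_by_year.values())
total_person_years = sum(population_by_year.values())
three_year_rate = rate_or_none(total_events, total_person_years)
print("1997-1999", three_year_rate)
```

Pooling trades time resolution for stability, so the pooled rate describes the whole period rather than any single year.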
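Methods for age adjustment
Direct method of age-adjustment
Multiply the age-specific rates in the target population by the age distribution
of the standard population and sum the products over all age groups:

$$\text{age-adjusted rate} = \sum_{i=1}^{m} s_i \, \frac{d_i}{P_i} = \sum_{i=1}^{m} \frac{s_i}{P_i} \, d_i$$

where $m$ is the number of age groups, $d_i$ is the number of cases (events or people)
in age group $i$, $P_i$ is the population in age group $i$, and $s_i$ is the proportion
of the standard population in age group $i$. This is a weighted sum of Poisson random
variables, with the weights being $s_i / P_i$.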
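A minimal sketch of the direct method follows, assuming three hypothetical age groups and a hypothetical standard population; a real analysis would use one of the US standard populations with the matching age groups. Each age-specific rate (cases divided by population) is multiplied by the standard proportion for that age group, and the products are summed.

```python
def direct_age_adjusted_rate(events, populations, standard_proportions,
                             constant=100_000):
    """Sum of s_i * d_i / P_i over age groups, scaled by `constant`."""
    if not (len(events) == len(populations) == len(standard_proportions)):
        raise ValueError("all inputs need one value per age group")
    if abs(sum(standard_proportions) - 1.0) > 1e-4:
        raise ValueError("standard proportions should sum to 1")
    return constant * sum(
        s * d / p
        for d, p, s in zip(events, populations, standard_proportions)
    )


# Hypothetical community with three age groups (young, middle, old).
events = [10, 50, 200]                  # d_i: cases in each age group
populations = [50_000, 40_000, 10_000]  # P_i: population in each age group
standard = [0.4, 0.4, 0.2]              # s_i: hypothetical standard proportions

crude = 100_000 * sum(events) / sum(populations)
adjusted = direct_age_adjusted_rate(events, populations, standard)
print(f"crude = {crude:.1f}, age-adjusted = {adjusted:.1f} per 100,000")
```

In this made-up community the age-adjusted rate is higher than the crude rate because the standard gives more weight to the oldest group, where the rate is highest, than the community's own age distribution does.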
US Standard Populations for direct adjustment
Currently, the National Center for Health
Statistics (NCHS) age-adjusts rates using the US 1940 standard population with eleven age
groups. These groups are: less than 1 year, 1-4 years and nine 10-year age groups
beginning at age 5. The National Cancer Institute (NCI) uses the US 1970 standard
population with eighteen 5-year age groups.
Starting with 1999 deaths, the estimated U.S.
population in 2000 will become the standard population for age-adjusting death rates and
cancer incidence rates. This will affect the size of age-adjusted rates, since the new
standard will have a higher concentration of the middle-aged and older population (see
Anderson, 1998, for the population pyramids for the two populations). Generally, the
magnitude of age-adjusted death rates will increase for causes of death with higher rates
in older people, and decrease for causes of death with higher rates in younger people. The
age-adjusted mortality rate for total deaths will also be higher with the new population
standard. The NCHS will use the same age groups as in the 1940 standard. The NCI may
continue to use the eighteen 5-year age groups.
Below are the US 1940, 1970 and 2000 standard
populations.
| 1940 Standard: age group | proportion | 1970 Standard: age group | proportion | 2000 Standard (10-yr. age groups): age group | proportion | 2000 Standard (5-yr. age groups): age group | proportion |
| --- | --- | --- | --- | --- | --- | --- | --- |
| <1 | 0.015343 | 0-4 | 0.084416 | <1 | 0.013818 | 0-4 | 0.069135 |
| 1-4 | 0.064718 | 5-9 | 0.098204 | 1-4 | 0.055317 | 5-9 | 0.072533 |
| 5-14 | 0.170355 | 10-14 | 0.102304 | 5-14 | 0.145565 | 10-14 | 0.073032 |
| 15-24 | 0.181677 | 15-19 | 0.093845 | 15-24 | 0.138646 | 15-19 | 0.072169 |
| 25-34 | 0.162066 | 20-24 | 0.080561 | 25-34 | 0.135573 | 20-24 | 0.066478 |
| 35-44 | 0.139237 | 25-29 | 0.066320 | 35-44 | 0.162613 | 25-29 | 0.064529 |
| 45-54 | 0.117811 | 30-34 | 0.056249 | 45-54 | 0.134834 | 30-34 | 0.071044 |
| 55-64 | 0.080294 | 35-39 | 0.054656 | 55-64 | 0.087247 | 35-39 | 0.080762 |
| 65-74 | 0.048426 | 40-44 | 0.058958 | 65-74 | 0.066037 | 40-44 | 0.081851 |
| 75-84 | 0.017304 | 45-49 | 0.059622 | 75-84 | 0.044842 | 45-49 | 0.072118 |
| 85+ | 0.002770 | 50-54 | 0.054643 | 85+ | 0.015508 | 50-54 | 0.062716 |
| | | 55-59 | 0.049077 | | | 55-59 | 0.048454 |
| | | 60-64 | 0.042403 | | | 60-64 | 0.038793 |
| | | 65-69 | 0.034406 | | | 65-69 | 0.034264 |
| | | 70-74 | 0.026789 | | | 70-74 | 0.031773 |
| | | 75-79 | 0.018871 | | | 75-79 | 0.026999 |
| | | 80-84 | 0.011241 | | | 80-84 | 0.017842 |
| | | 85+ | 0.007435 | | | 85+ | 0.015508 |
Indirect adjustment
When the number of events in a community is small, or when developing statistics for
use in communities concerned about the number of events, compare the observed number of
events to the expected number, using indirect age-adjustment or age- and sex-adjustment.
- To develop a count of expected events using
indirect adjustment, apply the age-specific rates in a larger population
(e.g., Washington state) to the number of people in the age-specific
group in the population of interest, and total the results for all age
groups. Ideally, the larger population should be large enough that the
rates in that population are stable (i.e., exhibit little random
variation).
- Compare the observed number of events (usually abbreviated "O") and the
expected number obtained using indirect adjustment (usually abbreviated "E").
Generally, if the confidence interval (CI) around O does not include E, the observed
number of events is statistically significantly different from the expected. (Breslow and
Day, 1987) The Poisson
confidence limits around O can be obtained from standard tables or can
be calculated using several software packages. This method assumes that
E is developed from stable rates (i.e., the standard population is large
enough that there is little random variation in the rates for that
population.) A more conservative estimate of statistical significance
(and the estimate that should be used when the rates in the larger
population are not as stable as one would like) is to develop CIs around
both E and O. If the CIs do not overlap, the difference between O and E
is statistically significant.
- A more precise method of determining whether the observed number of events is different
from the expected is to develop a ratio of O to E (O/E) and conduct a statistical test
to determine whether the ratio differs significantly from 1. Most biostatistics textbooks
provide methods for conducting the statistical test. Fisher and van Belle (1993) provide
an approximation for large samples only. Rosner (1990) provides methods for small and
large samples.
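The steps above can be sketched as follows. The reference rates, local populations, and observed count are hypothetical. For the confidence limits around O, this sketch swaps in Byar's approximation to the Poisson limits so that it stays self-contained; that choice is an assumption of the example, and exact limits from a standard table or statistical software are an alternative.

```python
import math
from statistics import NormalDist


def expected_events(reference_rates, local_populations):
    """Apply reference age-specific rates (per person) to local populations."""
    return sum(r * n for r, n in zip(reference_rates, local_populations))


def byar_poisson_ci(observed, alpha=0.05):
    """Approximate Poisson limits for a count (Byar's approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (
            1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3
    k = observed + 1
    upper = k * (1 - 1 / (9 * k) + z / (3 * math.sqrt(k))) ** 3
    return lower, upper


# Hypothetical age-specific rates (events per person per year) in the larger
# reference population, and the local population in the same age groups.
reference_rates = [0.0005, 0.002, 0.01]
local_populations = [2_000, 1_500, 500]
observed = 18  # hypothetical observed count (O) in the community

expected = expected_events(reference_rates, local_populations)  # E
low, high = byar_poisson_ci(observed)

print(f"O = {observed}, E = {expected:.1f}, O/E = {observed / expected:.2f}")
print(f"95% CI around O: ({low:.1f}, {high:.1f})")
print("E outside the CI around O:", not (low <= expected <= high))
```

This comparison treats E as fixed, which matches the assumption stated above that the reference rates are stable.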
Glossary
Rate: A rate is a measure of the frequency of an event per population unit. The
use of rates, rather than raw numbers, is important for comparison among populations,
since the number of events depends, in part, on the size of the population.
Numerator: In calculating rates, the numerator is the number of events in a
specified population.
Denominator: In calculating rates, the denominator is the number of people in a
specified population. Everyone in the denominator must be eligible to be counted in the
numerator. The denominator is often called the "population at risk."
Crude rate: A crude rate is calculated by dividing the total number of events in
a specified time period by the total number of individuals in the population who are at
risk for these events and multiplying by a constant, such as 1,000 or 100,000 [e.g.,
(numerator/denominator) x constant].
Age Adjustment: Age-adjustment is the process by which differences in the age
composition of two or more populations are removed, to allow comparisons between these
populations in the frequency with which an age-related health event occurs.
Age-adjusted rate (direct adjustment): An age-adjusted rate adjusted by the
direct method is "the rate that would occur if the observed age-specific
rates were present in a population with an age distribution equal to
that of a standard population." (Anderson, 1998)
Age-specific or age-limited rate: An age-specific rate is a rate in which the
number of events and population at risk are restricted to an age group (e.g., the birth
rate for women age 15 to 19; death rate for people age 45 to 64).
Standard population: The standard population refers to the choice of populations
used in developing age-adjusted rates.
References and Resources
Policy Statement on Changing the
Population Standard Used for Age Adjusting Death Rates in DHHS Publications
Anderson RN, Rosenberg HM. Age standardization
of death rates: Implementation of the Year 2000 Standard. National Vital Statistics
Reports; vol. 47 no. 3. Hyattsville, Maryland: National Center for Health Statistics,
1998.
Breslow NE and Day NE. Statistical Methods in Cancer Research: Volume II, The
Design and Analysis of Cohort Studies. New York: Oxford University Press, 1987.
Choi BCK, de Guia NA, and Walsh P. Look before you leap: Stratify before you
standardize. American Journal of Epidemiology, 149: 1087-1096, 1999.
Fay MP and Feuer EJ. Confidence intervals for directly standardized rates: A method
based on the gamma distribution. Statistics in Medicine, 16: 791-801, 1997.
Fisher LD and van Belle G. Biostatistics: A Methodology for the Health Sciences.
New York: John Wiley and Sons, Inc., 1993.
Kuller LH. Age-adjusted death rates: A hazard to epidemiology? [editorial] Annals of
Epidemiology, 9(2): 91-2, 1999.
Last JM [ed]. Dictionary of Epidemiology, 3rd edition. New York:
Oxford, 1995.
Rosner BA. Fundamentals of Biostatistics, 3rd Edition. Boston:
PWS-Kent Publishing, 1990.
Selvin S. Statistical Analysis of Epidemiologic Data, 2nd Edition.
New York: Oxford University Press, 1996.
Sorlie PD, Thom TJ, Manolio T, Rosenberg HM, Anderson RN and Burke GL. Age-adjusted
death rates: Consequences of the year 2000 standard. Annals of Epidemiology,
9:93-100, 1999.