Jessica Ochs; Sherry L. Roper; Susan M. Schwartz

12.5 Epidemiologic Measures

Learning Outcomes

By the end of this section, you should be able to:

12.5.1 Define significant terms related to disease occurrence in a population.
12.5.2 Discuss mathematical terms used in epidemiology.
12.5.3 Describe epidemiological measures used to define and quantify health problems in and across defined populations.
12.5.4 Identify analytic methods for calculating key measures of morbidity, mortality, and measures of association.
12.5.5 Describe possible sources of error in epidemiological studies.
12.5.6 Understand epidemiological criteria used to establish causal relationships.

Epidemiologists often categorize the amount of a disease present in a community as a specific level of disease: endemic, hyperendemic, sporadic, epidemic, outbreak, or pandemic. Public health officials use this information to assist in planning appropriate interventions. The endemic level is the continual and constant presence of a disease within a geographic area—the observed level in a defined area; it may also be referred to as the usual rate of disease at any given time or the baseline level.

Persistent high levels of disease in a defined area are characterized as hyperendemic. Diseases that occur occasionally at irregular intervals are considered sporadic or infrequent. When the level of disease in a defined area rises above the endemic levels—a sudden increase in the number of cases of a disease above what is normally expected—it is referred to as an epidemic. Both infectious and noninfectious diseases can become epidemics. An outbreak is an epidemic that affects a limited geographic area. A pandemic is a worldwide epidemic. See Pandemics and Infectious Disease Outbreaks.

Common Epidemiological Measures

Using epidemiological measures is one way of knowing when there is an excess of what is expected (Celentano & Szklo, 2019). Common frequency measures in epidemiology are ratios, proportions, and rates.

Ratios

A ratio is a comparison of any two values, calculated by dividing one value by the other. The numerator and denominator do not have to be related. After dividing the numerator by the denominator, the result is expressed as the result “to one” or “:1.” See Calculating Ratios for more detail on the calculation.

Ratios can be used as a descriptive measure, such as describing the man-to-woman ratio of participants in a study. They can also be used in calculations for the occurrence of illness or death between two groups. Death-to-case ratio is used as a measure of illness severity because it refers to the number of deaths attributed to a disease during a specific period of time divided by the number of new cases during the same period. As an example, rabies has a death-to-case ratio of almost 1, meaning that almost everyone who develops it dies from it (CDC, 2012).

Calculating Ratios

Recall that:

$10^{0} = 1$

$10^{1} = 10$

$10^{2} = 10 \times 10 = 100$

$10^{3} = 10 \times 10 \times 10 = 1,000$

Basic Calculation for Ratios:

$(\text{Numerator} \div \text{Denominator}) \times 10^{n}$

Calculating Ratios: Example 1

You are a research nurse reviewing the medical histories of the study participants. Given this study’s parameters, they are categorized as having hypertension or not having hypertension.

Men with hypertension: 305

Men without hypertension: 4,702

Based on the data, calculate the ratio of men without hypertension to men with hypertension. Because the result will be expressed “to one,” you will use 10⁰, which equals 1.

Ratio $= \frac{4,702}{305} \times 10^{0} = 15.4:1$ (15.4 men without hypertension to 1 man with hypertension)

Calculating Ratios: Example 2

Imagine you are the public health nurse for a county in a rural part of the United States. You have been tasked with calculating the ratio of county citizens to the number of health clinics in the county—in other words, how many county residents each health clinic must serve.

Number of health clinics: 8

County population: 9,000

Ratio $= \frac{9,000}{8} \times 10^{0} = 1,125:1$ (Each clinic must serve 1,125 county citizens.)

(See Centers for Disease Control and Prevention, 2012.)
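The two ratio examples above can be reproduced with a short Python sketch. The function name `ratio` and its parameter `n` (the exponent in $10^{n}$) are illustrative assumptions, not part of the cited source.

```python
def ratio(numerator: float, denominator: float, n: int = 0) -> float:
    """Return (numerator / denominator) * 10**n."""
    if denominator == 0:
        raise ValueError("denominator must be nonzero")
    return numerator / denominator * 10 ** n


# Example 1: men without hypertension to men with hypertension
hypertension_ratio = ratio(4702, 305)
print(f"{hypertension_ratio:.1f}:1")  # 15.4:1

# Example 2: county residents per health clinic
clinic_ratio = ratio(9000, 8)
print(f"{clinic_ratio:,.0f}:1")  # 1,125:1
```

Leaving `n` at 0 keeps the result in the “to one” form used in both examples.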

Proportions

A proportion is a form of ratio where the numerator represents a subset of the denominator. An example is looking at the percentage of a population that is younger than 18 years. Proportions can be communicated as a decimal, fraction, or percentage. See Calculating Proportions for more details on the calculation. Proportions are often used as descriptive measures, such as the proportion of children in a community vaccinated against the flu or the proportion of individuals at a boarding school who developed illness (CDC, 2012). Proportions also describe the extent of disease attributable to a particular exposure. Proportionate mortality is the proportion of deaths in a defined population during a defined time period that are attributed to different causes. Each cause is communicated as a percentage of all deaths, where the sum of causes equals 100 percent (CDC, 2012). These proportions are not rates, as the denominator is all deaths.

Calculating Proportions

Basic Calculation for a Proportion

(Number of persons or events with a particular characteristic ÷ Total number of persons or events, with the numerator being a subset of this total number) $\times 10^{n}$

With proportions, 10ⁿ is usually expressed with $n = 2$ , or 100, and conveyed as a percentage.

Calculating a Proportion

Refer to Calculating Ratios and the study looking at hypertension.

Men with hypertension: 305

Men without hypertension: 4,702

Based on the data, calculate the proportion of men who had hypertension.

Men with hypertension (305) plus men without hypertension (4,702) = total number of men; $(305 + 4,702) = 5,007$

Proportion of men who had hypertension $= \frac{305}{5,007} \times 10^{2} = 6.09\%$

(See Centers for Disease Control and Prevention, 2012.)
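As a rough illustration, the proportion calculation can be written in Python as follows. The function name `proportion` and the check that the numerator is a subset of the total are assumptions made for this sketch.

```python
def proportion(subset: int, total: int, n: int = 2) -> float:
    """Return (subset / total) * 10**n; with n = 2 the result is a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= subset <= total:
        raise ValueError("the numerator must be a subset of the total")
    return subset / total * 10 ** n


men_with_hypertension = 305
men_without_hypertension = 4702

# The denominator is everyone in the study, not only the men without hypertension.
total_men = men_with_hypertension + men_without_hypertension

pct_hypertensive = proportion(men_with_hypertension, total_men)
print(f"{pct_hypertensive:.2f}%")  # 6.09%
```

The subset check reflects the defining feature of a proportion: every person counted in the numerator must also be counted in the denominator.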

Rates

In epidemiology, a rate is a measure of how often an event occurs in a specified population over a defined period of time. Rates are useful for comparing disease frequency in different locations, at different times, or among different groups of individuals, and they are often considered a measure of risk (CDC, 2012). Epidemiologists use rates to describe incidence, prevalence, case-fatality, and attack rates. It is important to note that attack rate, prevalence rate, and case-fatality rate are not considered “true” rates by some because they are not expressed in units of time, but these proportions provide valuable information in looking for patterns and using data for comparison (CDC, 2012).

Morbidity is another term for having a disease, illness, or medical condition and includes disease, injury, and disability. Measures of morbidity characterize the number of individuals in a population who become ill or are ill at a specified time. Commonly used measures of morbidity are attack rate (also known as incidence proportion), secondary attack rate, incidence rate, point prevalence, and period prevalence. See Incidence Rate, Prevalence Rate, Attack Rate, and Case-Fatality Rate for more detailed definitions of these rates. The mortality rate measures the frequency of death in a defined population during a specific time interval. See Mortality Rate for more details on the various mortality rates often used.

Incidence Rate

An incidence rate is a proportion in which the numerator is all new cases of a disease or health condition during a given period of time. The denominator is the population at risk during the same period. Incidence rates describe how quickly a disease or illness occurs in a specified population. Incidence rates of a health condition in a population are often expressed as the incidence rate per 100,000 population.

Example:

In 2020, 70,000 new cases of disease X were reported in the United States. The estimated midyear U.S. population was approximately 319,000,000. Calculate the incidence rate of disease X in the United States in 2020.

Numerator: 70,000 new cases of disease X

Denominator: 319,000,000 estimated midyear population

$10^{n} = 10^{5} = 100,000$

$\text{Incidence rate} = (\frac{70,000}{319,000,000}) \times 100,000$

$\text{Incidence rate} = 21.94$ new cases of disease X per 100,000 population

(See Centers for Disease Control and Prevention, 2012.)
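The incidence rate example can be checked with a minimal Python sketch. The function name `incidence_rate` and the `per` parameter (standing in for $10^{n}$) are assumptions made for illustration.

```python
def incidence_rate(new_cases: int, population_at_risk: int, per: int = 100_000) -> float:
    """Return new cases per `per` people at risk during the period."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return new_cases / population_at_risk * per


# Disease X example: 70,000 new cases, midyear population of 319,000,000
rate_x = incidence_rate(70_000, 319_000_000)
print(f"{rate_x:.2f} new cases per 100,000 population")  # 21.94
```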

Prevalence Rate

A prevalence rate is the proportion of a population with a health condition at a certain point in time or over an interval of time. An example is how many individuals in a population have or have had influenza in the month of February. Unlike incidence, prevalence includes all cases of a particular disease, both new and pre-existing, in the population at the specified time. Incidence is limited to new cases only. Prevalence is often measured for chronic diseases as they have long durations and unclear onsets.

Calculating Prevalence of a Disease

Numerator: All new and pre-existing cases during an established time period

Denominator: Population during the same period

Multiply the result by $10^{n}$ . A value of 100 is most often used for $10^{n}$ ; therefore, $10^{n} = 10^{2} = 100$ .

Point prevalence refers to the prevalence measured at a specified point in time, that is, the proportion of individuals with a disease on a specified date.

Period prevalence refers to prevalence measured over a span of time.

(See Centers for Disease Control and Prevention, 2012.)
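The difference between point and period prevalence can be sketched in Python. All numbers below are hypothetical and chosen only for illustration; the function name `prevalence` is likewise an assumption.

```python
def prevalence(cases: int, population: int, n: int = 2) -> float:
    """Return (cases / population) * 10**n; with n = 2 the result is a percentage."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 10 ** n


# Hypothetical community (illustrative numbers, not from the source).
population = 4_000
existing_cases_on_feb_1 = 120    # residents who already have the condition on February 1
new_cases_during_february = 80   # residents who develop it later in February

# Point prevalence: everyone with the condition on one specified date.
point = prevalence(existing_cases_on_feb_1, population)

# Period prevalence: pre-existing plus new cases over the whole interval.
period = prevalence(existing_cases_on_feb_1 + new_cases_during_february, population)

print(f"Point prevalence on February 1: {point:.1f}%")
print(f"Period prevalence for February: {period:.1f}%")
```

Because period prevalence adds the new cases that arise during the interval, it can never be lower than the point prevalence at the start of that interval.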

Attack Rate (Incidence Proportion)

The attack rate is the proportion of a population that develops an illness during an outbreak. Another term for an attack rate is incidence proportion, and it can be thought of as a measure of risk.

Calculating Attack Rate

In a food poisoning outbreak, 50 out of 150 individuals develop nausea, vomiting, and diarrhea (n/v/d) after attending a party where the 150 individuals all ate cheddar cheese. To measure the food-specific attack rate, divide the number of individuals who ate a specific food and became ill (the numerator) by the total number of individuals who ate that food (the denominator).

Numerator: 50 individuals who ate the cheddar cheese and developed n/v/d

Denominator: 150 individuals who ate the cheddar cheese

Multiply the result by $10^{n}$ . A value of 100 is most often used for $10^{n}$ ; therefore, $10^{n} = 10^{2} = 100$ .

Food-specific attack rate $= (\frac{50}{150}) \times 100 = 33.3\%$

Calculating Incidence Proportion (also known as risk)

In a study of men with and without hypertension, 165 of the men with hypertension (out of a total of 305 men with hypertension) had died during the follow-up study 10 years later. Calculate the risk of death for these men.

Men with hypertension: 305

Numerator: 165 deaths among men with hypertension

Denominator: 305 men with hypertension

$10^{n} = 10^{2} = 100$

$\text{Incidence proportion (Risk)} = (\frac{165}{305}) \times 100 = 54.1\%$

(See Centers for Disease Control and Prevention, 2012.)
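Both calculations in this section use the same arithmetic, which the following Python sketch makes explicit. The function name `incidence_proportion` is an assumption chosen for this illustration.

```python
def incidence_proportion(new_cases: int, population_at_start: int, n: int = 2) -> float:
    """Return (new cases / population at risk) * 10**n; a percentage when n = 2."""
    if population_at_start <= 0:
        raise ValueError("population_at_start must be positive")
    if not 0 <= new_cases <= population_at_start:
        raise ValueError("new_cases must come from the population at risk")
    return new_cases / population_at_start * 10 ** n


# Food-specific attack rate: 50 of the 150 people who ate the cheddar cheese became ill.
cheese_attack_rate = incidence_proportion(50, 150)
print(f"Food-specific attack rate: {cheese_attack_rate:.1f}%")  # 33.3%

# Risk of death over 10 years among the 305 men with hypertension.
death_risk = incidence_proportion(165, 305)
print(f"Incidence proportion (risk): {death_risk:.1f}%")  # 54.1%
```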

Case-Fatality Rate

A case-fatality rate is the proportion of individuals with a disease who die from it. With this proportion, the numerator can only include deaths among individuals included in the denominator, but the time periods do not need to be the same. It is a measure of disease severity.

For example, a multistate outbreak of hepatitis A was traced to contaminated strawberries. There were 300 cases and two deaths as a result of the infection.

$\text{Case-fatality rate} = (\frac{2}{300}) \times 100 = 0.67\%$

(See Centers for Disease Control and Prevention, 2012.)
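A minimal Python sketch of the hepatitis A example follows. The function name `case_fatality_rate` is an assumption; the check on the inputs mirrors the requirement that deaths be counted only among the cases in the denominator.

```python
def case_fatality_rate(deaths_among_cases: int, cases: int) -> float:
    """Return the percentage of cases who died from the disease."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    if not 0 <= deaths_among_cases <= cases:
        raise ValueError("deaths must be counted among the cases in the denominator")
    return deaths_among_cases / cases * 100


# Hepatitis A outbreak: 300 cases, 2 deaths
cfr = case_fatality_rate(2, 300)
print(f"Case-fatality rate: {cfr:.2f}%")  # 0.67%
```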

Mortality Rate

A mortality rate is the measure of death in a defined population during a specific time period.

Numerator = Deaths occurring during a specific time period

Denominator = Size of the population where deaths occurred

Multiply the result by $10^{n}$. Values of 1,000 or 100,000 are often used for $10^{n}$; therefore, $10^{n} = 10^{3}$ or $10^{5}$.

Commonly Used Measures of Mortality:

Crude death rate is the mortality rate from all causes of death for a population.

Numerator: Total number of deaths during a given time interval

Denominator: Mid-interval population

$10^{n} = 10^{3}$ or $10^{5}$

For example, suppose that a total of 3,000,000 deaths occurred in 2020. The estimated population was 400,000,000 at the midpoint of 2020. Therefore, the crude mortality/crude death rate in 2020 was $(\frac{3,000,000}{400,000,000}) \times 100,000 = 750$, or 750 deaths per 100,000 population.

Cause-specific death rate is the mortality rate from a specific cause in a population.

Numerator: Number of deaths assigned to a specific cause during a given time interval

Denominator: Mid-interval population

$10^{n} = 10^{5}$

For example, in 2020, there were 91,799 drug overdose deaths in the United States (CDC, 2022b). The midyear population of 2020 was 333,287,557. Therefore, the cause-specific (drug overdose) mortality rate was $(\frac{91,799}{333,287,557}) \times 100,000 = 27.5$, or 27.5 deaths per 100,000 population.

Infant mortality rate

Numerator: Number of deaths among children < 1 year of age during a specified time period

Denominator: Number of live births during the same time period

$10^{n} = 10^{3}$

Maternal mortality rate

Numerator: Number of deaths assigned to pregnancy-related causes during a specified time period

Denominator: Number of live births during the same time period

$10^{n} = 10^{5}$

(See Centers for Disease Control and Prevention, 2012.)
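The crude and cause-specific death rate examples above share one formula, sketched here in Python. The function name `mortality_rate` and the `per` parameter (standing in for $10^{n}$) are assumptions made for illustration.

```python
def mortality_rate(deaths: int, population: int, per: int = 100_000) -> float:
    """Return deaths per `per` population (or per `per` live births)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * per


# Crude death rate: all causes, mid-interval population
crude_rate = mortality_rate(3_000_000, 400_000_000)
print(f"Crude death rate: {crude_rate:.0f} per 100,000 population")  # 750

# Cause-specific death rate: drug overdose deaths in 2020
overdose_rate = mortality_rate(91_799, 333_287_557)
print(f"Drug overdose death rate: {overdose_rate:.1f} per 100,000 population")  # 27.5
```

The same function would fit the infant and maternal mortality rates if the number of live births is passed as the denominator, with `per=1_000` for the infant mortality rate.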

Measures of Association

Epidemiologists seek to identify causal relationships between agents and disease by first assessing whether they are associated. An observed association between an exposure or an agent and a disease may indicate a causal relationship, but it may instead result from error, such as a flaw in the sampling method used (Green et al., 2011). Measures of association, such as relative risk (risk ratio), rate ratio, odds ratio, and attributable risk, assess the degree to which the risk of disease increases when exposed to an agent, thereby quantifying the strength of an association, which is one consideration in judging whether a relationship is causal. Measures of association essentially compare disease occurrence between two groups (the primary interest group and the comparison group) and inform the epidemiological criteria used to establish causal relationships (CDC, 2012).

Relative Risk

Relative risk (RR), or risk ratio, compares the risk of a health event among one group with the risk among another group, the comparison group. RR is the ratio of the incidence proportion of the health event in exposed individuals (or the group of primary interest) to the incidence proportion in unexposed individuals (or the comparison group). RR of 1.0 indicates equal risk between the two groups. RR greater than 1 indicates an increased risk for the group in the numerator, the exposed group. RR less than 1 indicates a decreased risk for the exposed group, signaling the exposure may protect against the disease or health event occurrence. See Relative Risk for more information on the calculation of RR and an example.

Relative Risk

Calculating Relative Risk (RR)

Numerator: Risk of disease or health event, the incidence proportion, in primary interest group

Denominator: Risk of disease or health event, the incidence proportion, in comparison group

Example

A researcher is studying 300 individuals exposed to a potential carcinogen and 500 individuals who were not exposed to this potential carcinogen. After a five-year follow-up, 125 of the exposed individuals are diagnosed with the disease, and 75 of the unexposed individuals are also diagnosed with the disease. The relative risk of contracting the disease is calculated by:

Incidence proportion of the primary interest group (exposed group) is $(\frac{125}{300}) = 0.42$

Incidence proportion of the comparison group (unexposed group) is $(\frac{75}{500}) = 0.15$

Therefore, the RR is $(\frac{0.42}{0.15}) = 2.8$ . This RR of 2.8 suggests the risk of disease in the exposed group is 2.8 times as high as the risk of disease in the unexposed group.
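The relative risk example can be reproduced with a short Python sketch. The function name `relative_risk` and its parameter names are assumptions for this illustration. The code divides the unrounded incidence proportions, so the result differs slightly from dividing 0.42 by 0.15 but rounds to the same value.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Return the incidence proportion in the exposed divided by that in the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("risk in the comparison group is zero; RR is undefined")
    return risk_exposed / risk_unexposed


# Carcinogen example: 125 of 300 exposed and 75 of 500 unexposed develop the disease.
rr = relative_risk(125, 300, 75, 500)
print(f"RR = {rr:.1f}")  # RR = 2.8
```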

Rate Ratio

A rate ratio compares the incidence rates or mortality rates of two groups. Similar to the risk ratio, the two groups are usually differentiated by exposure to a suspected causative agent and a comparison group or are differentiated by demographic factors such as gender or age. The rate for the primary interest group (exposure group) is divided by the rate for the comparison group. A rate ratio of 1.0 indicates equal risk between the two groups. A rate ratio greater than 1 indicates an increased risk for the group in the numerator, the exposed group. A rate ratio less than 1 indicates a decreased risk for the exposed group, signaling the exposure may protect against the disease or health event occurrence. See Rate Ratio for more information on the calculation of rate ratio and an example.

Rate Ratio

Calculating Rate Ratio

Numerator: Incidence rate (or mortality rate) in primary interest group (exposure group)

Denominator: Incidence rate (or mortality rate) in comparison group (non-exposed group)

Example

A public health nurse is investigating a perceived increase in flu-related deaths in January–March of 2022 in a large city compared to flu-related deaths the year prior in January–March 2021 in the same city. In 2021, there were 129 flu-related deaths among a midyear population of 500,000. In 2022, there were 310 flu-related deaths among a midyear population of 502,000. Calculate the rate ratio as follows:

Mortality rate of the primary interest group (2022 mortality rate) is $(\frac{310}{502,000}) \times 1,000 = 0.62$ per 1,000 population

Mortality rate of the comparison group (2021 mortality rate) is $(\frac{129}{500,000}) \times 1,000 = 0.26$ per 1,000 population

Therefore, the rate ratio is $(\frac{0.62}{0.26}) = 2.4$. This suggests the flu-related mortality rate in the primary interest group (2022) was 2.4 times as high as the rate in the comparison group (2021).
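A minimal Python sketch of the flu mortality example follows. The function names `rate_per` and `rate_ratio` are assumptions chosen for illustration.

```python
def rate_per(events: int, population: int, per: int = 1_000) -> float:
    """Return events per `per` population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * per


def rate_ratio(rate_primary: float, rate_comparison: float) -> float:
    """Return the rate in the primary interest group divided by the comparison rate."""
    if rate_comparison == 0:
        raise ValueError("comparison rate is zero; the rate ratio is undefined")
    return rate_primary / rate_comparison


rate_2022 = rate_per(310, 502_000)  # primary interest group
rate_2021 = rate_per(129, 500_000)  # comparison group
flu_rate_ratio = rate_ratio(rate_2022, rate_2021)

print(f"2022: {rate_2022:.2f} per 1,000")       # 0.62
print(f"2021: {rate_2021:.2f} per 1,000")       # 0.26
print(f"Rate ratio: {flu_rate_ratio:.1f}")      # 2.4
```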

Odds Ratio

The odds ratio (OR) is similar to the RR as it quantitatively expresses the association between an exposure and a disease or health outcome. The OR is most often used to estimate the RR in case-control studies when the disease being investigated is rare. It is the ratio of the odds of developing a disease when exposed to an agent to the odds of developing the disease when not exposed. Investigators can use the OR to estimate the RR since RRs cannot be calculated from a typical case-control study. See Odds Ratio for more information on the calculation of the odds ratio and an example.

Odds Ratio

Calculating Odds Ratio in a Case-Control Study

Numerator: Odds a case was exposed

Denominator: Odds a control was exposed

Odds ratio $= \frac{(a \times d)}{(b \times c)}$ where

a = number of individuals exposed and with disease

b = number of individuals exposed but without disease

c = number of individuals unexposed but with disease

d = number of individuals unexposed and without disease

Example

	Cases with Disease	Controls with No Disease	Totals
Exposed	$a = 190$	$b = 2,000$	2,190
Not exposed	$c = 140$	$d = 10,000$	10,140
Total	330	12,000	12,330

Table 12.2

Calculate the OR as follows:

$\frac{(190 \times 10,000)}{(2,000 \times 140)} = 6.79$

Now, using the same data in the table, calculate the RR as follows:

$\frac{(\frac{190}{2,190})}{(\frac{140}{10,140})} = 6.28$

The OR and the RR are close because the disease is relatively uncommon in both the exposed and unexposed groups, so the OR provides a realistic estimate of the RR.
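Both calculations from Table 12.2 can be checked with a short Python sketch that uses the same cell labels (a, b, c, d). The function names are assumptions made for illustration.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product ratio (a * d) / (b * c) from a 2x2 table."""
    if b * c == 0:
        raise ValueError("b and c must both be nonzero")
    return (a * d) / (b * c)


def relative_risk_from_table(a: int, b: int, c: int, d: int) -> float:
    """Risk in the exposed row, a / (a + b), divided by risk in the unexposed row, c / (c + d)."""
    return (a / (a + b)) / (c / (c + d))


# Table 12.2: a = exposed with disease, b = exposed without disease,
#             c = unexposed with disease, d = unexposed without disease
a, b, c, d = 190, 2_000, 140, 10_000

table_or = odds_ratio(a, b, c, d)
table_rr = relative_risk_from_table(a, b, c, d)
print(f"OR = {table_or:.2f}")  # OR = 6.79
print(f"RR = {table_rr:.2f}")  # RR = 6.28
```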

Attributable Risk

The attributable risk (AR) is an often-used measurement of risk, representing the amount of disease among exposed individuals attributed to the exposure. AR represents the maximum proportion of disease that can be attributed to the exposure and is the maximum proportion of disease that can possibly be prevented by eliminating the exposure (Green et al., 2011). AR can also be stated as the attributable proportion of risk, the proportion of the disease among exposed individuals that is associated with the exposure, and can be used as a measure of the public health impact of a causative factor. It assumes the incidence of disease in the unexposed group is the baseline and expected risk for that disease. It also assumes that if there is a difference between the incidence of disease in the two groups, the difference is due to the exposure. This is an appropriate tool to measure risk when a single risk factor or exposure is being considered but does not work well when there are multiple exposures to various agents. See Attributable Risk for more information on the calculation of attributable risk and an example.

Attributable Risk

Calculating Attributable Risk (AR)

Numerator: (Incidence in the exposed) – (Incidence in the unexposed)

Denominator: Incidence in the exposed

Example

A researcher is studying 300 individuals exposed to a potential carcinogen and 500 individuals who were not exposed to this potential carcinogen. After a five-year follow-up, 125 of the exposed individuals are diagnosed with the disease and 75 of the unexposed individuals are also diagnosed with the disease.

The incidence of disease in the exposed group is 125 individuals out of 300 $(100 \times \frac{125}{300} = 41.67\%)$ who contract the disease.

The incidence of disease in the unexposed group is 75 individuals out of 500 $(100 \times \frac{75}{500} = 15\%)$ who contract the disease.

To calculate the AR:

$\frac{(41.67 - 15)}{41.67} = 0.64$ or 64 percent, meaning that the proportion of disease among exposed individuals that is attributable to the exposure is 64 percent. “Attributable to” does not equate to “caused by,” as this calculation addresses association and does not infer causation.
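The attributable risk calculation can be sketched in Python as follows. The function name `attributable_risk` is an assumption for this illustration, and the result is returned as a fraction between 0 and 1 rather than a percentage.

```python
def attributable_risk(exposed_cases: int, exposed_total: int,
                      unexposed_cases: int, unexposed_total: int) -> float:
    """Return (incidence in exposed - incidence in unexposed) / incidence in exposed."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    if incidence_exposed == 0:
        raise ValueError("incidence in the exposed group is zero; AR is undefined")
    return (incidence_exposed - incidence_unexposed) / incidence_exposed


# Carcinogen example: 125 of 300 exposed and 75 of 500 unexposed develop the disease.
ar = attributable_risk(125, 300, 75, 500)
print(f"AR = {ar:.0%}")  # AR = 64%
```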

Sources of Error

While reviewing epidemiological studies, the nurse must assess the strength of associations demonstrated in the study. A source of error can be a positive association discovered between an exposure and a health event when there is no true association. Another potential source of error occurs when no association is found between an exposure and health event when there is an association. A study may also find an association, but the strength of the association is greater or less than the actual association. These types of errors may be a result of chance, bias, or confounding. The nurse has a duty to examine each study for the possibility of these types of errors, which are explained in the following sections.

Chance

Chance refers to a random error that may occur within any study. The larger the sample size (the higher the number of participants), the lower the likelihood of a random error, but a large sample size does not eliminate the risk of a random error (Green et al., 2011). Epidemiologists use statistical significance and confidence intervals to assess a study’s risk for random error (Green et al., 2011). Statistically significant results are unlikely to be the result of random error. Confidence intervals provide the relative risk (or other risk measure) found in the study and an interval within which the risk would most likely fall if the study were repeated multiple times. If a 95 percent confidence interval is chosen, the range includes results expected 95 percent of the time if the samples for new studies were continually drawn from the same population.

A p value represents the probability that an association at least as strong as the one observed could result from random error alone when no true association exists (Green et al., 2011). A p value of 0.2 means there is a 20 percent chance the values found could have occurred by random error with no actual association present. Epidemiologists strive to minimize false positives by requiring p values to fall below a selected level, often 0.05, known as alpha or the significance level, for the results of the study to be statistically significant (Green et al., 2011). The 0.05 alpha level means there is a 5 percent probability that the association found in the study would occur by chance, without an actual association. The outcome will be deemed statistically significant if the observed p value falls below the preselected significance level (Green et al., 2011). Note that the p value does not give the probability that the risk estimate in the study is correct; similarly, the confidence interval does not provide the range within which the true risk lies.
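To make the idea of a confidence interval concrete, the sketch below computes an approximate 95 percent confidence interval for a relative risk using a common large-sample method based on the logarithm of the RR. This method is an assumption introduced here for illustration and is not described in the cited sources. The function name and the use of the carcinogen cohort numbers from the relative risk example are likewise illustrative choices.

```python
import math


def relative_risk_ci(exposed_cases: int, exposed_total: int,
                     unexposed_cases: int, unexposed_total: int,
                     z: float = 1.96) -> tuple[float, float, float]:
    """Return (RR, lower, upper) using the large-sample log method.

    z = 1.96 corresponds to an approximate 95 percent interval.
    """
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    # Approximate standard error of ln(RR)
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Carcinogen cohort: 125 of 300 exposed and 75 of 500 unexposed develop the disease.
rr, ci_lower, ci_upper = relative_risk_ci(125, 300, 75, 500)
print(f"RR = {rr:.2f}, approximate 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

For these data the whole interval lies above 1.0, which is consistent with an association that is unlikely to be explained by random error alone. It says nothing about bias or confounding.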

Bias

Bias, also referred to as a systematic error (tendency to underestimate or overestimate the value of a parameter), is another source of error in a study’s outcome, arising in the design or conduct of the study, data collection, or data analysis (Green et al., 2011). Researchers attempt to minimize bias through strong data collection protocols and overall study design. Bias results in a non-random error in a study result. When present, it may invalidate the results.

The two common categories of bias are selection bias and information bias. Selection bias results from an inappropriate method of selecting study participants, particularly the control group; the control group should be drawn from the same population as the participants with the health event or health exposure that is under study. Selection bias also occurs when participants decline to participate after agreeing to do so or when they drop out before study completion.

An example of selection bias occurred in a study looking at the effect of hormone replacement therapy (HRT) on coronary heart disease (CHD) in women. Several observational studies demonstrated a decrease in CHD in women using HRT. Later randomized controlled trials found the opposite effect: HRT may increase the risk for CHD in this population. The difference in the findings was related to selection bias. Among the women in the observational studies, those taking HRT tended to be younger, more health conscious, and more physically active than those who were not using HRT. This resulted in a health-conscious bias as the observational studies represented the usual woman who initiated HRT (Catalogue of Bias Collaboration et al., 2017). Women who participated in the randomized controlled trials were older, more likely to be overweight or obese, and were using HRT for a much shorter duration than the women in the observational studies (Hodis & Mack, 2022).

Information bias results from a weakness in measuring the exposure or disease in the study group. In particular, in a case-control study, information bias is a consideration because the researcher depends on information from the past to determine exposure and disease and their relationship (Green et al., 2011). Research has demonstrated that individuals with disease (cases) recall past exposure better than individuals without disease (controls), creating a potential for recall bias. With case-control studies, researchers sometimes need to rely on interviews with surrogates when study subjects have died of the disease under investigation or are not well enough to be interviewed (Green et al., 2011). For example, in a case-control study looking at Alzheimer’s-type dementia, determining past exposures between the case and control groups could lead to information bias. In the cases, clients with Alzheimer’s-type dementia may have difficulty recalling past exposures compared to the control group, clients without Alzheimer’s-type dementia. Due to this recall bias, the presence of potential exposures as risk factors may be underreported in the case group, resulting in the researchers miscalculating the importance of these potential risk factors in disease development.

Confounding

Confounding is another type of error that may result in an incorrect conclusion about causation. It occurs when another factor (the confounder) distorts the apparent relationship between the agent under study and the outcome, so that the agent is mistakenly identified as responsible for the outcome (Green et al., 2011). A confounder is both a risk factor for the disease and a factor associated with the exposure being investigated. This may result in an incorrect conclusion about causation, since the misidentified agent is not the true causal factor (Green et al., 2011). Confounding can be more of an issue with observational studies because in observational studies the participants are not assigned randomly to the comparison groups. Randomization helps to ensure that exposures, outside of the one being investigated, are evenly distributed between groups. Other techniques to limit confounding occur in the design stage with appropriate and thoughtful methods for selecting participants. If factors of age, gender, or certain lifestyle habits are potential confounders in a study, the investigators can limit the impact by selecting controls that match cases (Green et al., 2011). An example of confounding occurred when researchers were trying to link cigarette smoking to lung cancer. There was an established association that cigarette smokers had higher lung cancer rates, but this could have been due to other exposures or lifestyle factors and not the cigarette smoking. An argument could be made that individuals who smoke cigarettes are more likely to live in areas of higher pollution and that the pollution is causing the cancer. The confounding variable here is air pollution.
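The mechanism of confounding can be illustrated with a small Python sketch that compares a crude relative risk with relative risks calculated separately within each level of the suspected confounder (stratification). Every number below is invented solely to show the mechanism and says nothing about the real relationship between smoking, air pollution, and lung cancer. The function and variable names are also assumptions.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Return the risk in the exposed group divided by the risk in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical counts per pollution level:
# (cases among smokers, smokers, cases among nonsmokers, nonsmokers)
strata = {
    "high pollution": (80, 800, 20, 200),
    "low pollution": (4, 200, 16, 800),
}

# Within each stratum the confounder is held constant.
stratum_rr = {name: relative_risk(*counts) for name, counts in strata.items()}

# The crude analysis ignores pollution and pools the strata column by column.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = relative_risk(*crude_counts)

for name, value in stratum_rr.items():
    print(f"RR within {name}: {value:.2f}")
print(f"Crude RR ignoring pollution: {crude_rr:.2f}")
```

In this invented data set, smokers are concentrated in the high-pollution area, where everyone's risk is higher. The crude RR therefore suggests an association between the exposure and the disease even though the RR within each pollution level is 1.0.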

Establishing Causation

Causation refers to an increase in disease incidence among exposed subjects that would not have occurred if these subjects had not been exposed. Epidemiology cannot prove causation, but causation can be inferred from the data that epidemiologists and public health professionals analyze and interpret. In assessing causation, epidemiologists look for alternative explanations for the association, such as chance, bias, or confounding as mentioned previously. After this process has ruled out these sources of error, epidemiologists use the following nine factors—the Bradford Hill criteria, commonly referred to as Hill’s criteria for causation—to guide them in making judgments about causation (Table 12.3).

Number	Criterion	Description
1	Temporal relationship	The exposure (causal factor) must occur prior to disease development.
2	Strength of the association	The stronger the association (the higher the relative risk), the more likely the relationship is causal.
3	Dose-response relationship	The greater the exposure to the causal factor, the higher the risk of disease.
4	Replication of the findings	If the relationship is causal, the study results will have been replicated in different populations with consistent study results.
5	Biological plausibility (coherence with existing knowledge)	The causal relationship should be consistent with current biologic information.
6	Consideration of alternative explanations	The causal relationship is reinforced when bias and confounding have been ruled out and other possible explanations have been considered.
7	Cessation of exposure	If a causal factor results in disease, the risk of the disease is decreased when the causal factor is removed.
8	Specificity of the association	The causal factor or exposure is associated with one disease.
9	Consistency with other knowledge	If the relationship is causal, the findings would fit and be consistent with other information and data.

Table 12.3 Hill Criteria for Causation (See Green et al., 2011.)