The Epidemiology of Tuberculosis in San
Francisco -- A Population-Based Study Using
Conventional and Molecular Methods
Peter M. Small, Philip C. Hopewell, Samir P. Singh, Antonio Paz, Julie Parsonnet, Delaney C.
Ruston, Gisela F. Schecter, Charles L. Daley, and Gary K. Schoolnik
N Engl J Med 1994; 330:1703-1709. June 16, 1994. DOI: 10.1056/NEJM199406163302402
Tuberculosis and its recent resurgence are predominantly urban phenomena in the United
States, where case rates in large cities are almost two and a half times higher than the
national average [1]. A combination of biologic and social factors has been postulated to
account for this situation. In many cities, the number of persons who are
immunosuppressed by infection with the human immunodeficiency virus (HIV) and the
prevalence of drug-resistant tuberculosis have increased in the face of deteriorating
socioeconomic conditions and public systems of health care delivery [2]. As a result,
important changes seem to have occurred in the patterns of Mycobacterium tuberculosis
transmission. In particular, the long-held assumption that only 10 percent of tuberculosis
cases are the result of recent infection needs to be reconsidered [3].
The combination of molecular fingerprinting of M. tuberculosis strains and conventional
epidemiologic investigation has improved understanding of the transmission of
tuberculosis. Molecular fingerprinting by restriction-fragment-length polymorphism
(RFLP) analysis yields a unique, strain-specific pattern of bands (the “fingerprint”) that is
stable for at least two years [4-8]. Comparison of M. tuberculosis fingerprints from
tuberculosis strains isolated during circumscribed outbreaks has demonstrated matching
patterns among persons who were clearly infected from a common source [4,8-19]. By
showing that patients with no obvious epidemiologic relation are infected with the same
strain, molecular fingerprinting has revealed that M. tuberculosis can be transmitted
during brief contact between persons who do not live or work together [18,20,21]. Taken
together, these studies suggest that patients with the same M. tuberculosis RFLP pattern
constitute an epidemiologically linked cluster. Furthermore, because tuberculosis
developed during a relatively short period in patients in a cluster, clustering indicates
recent infection and rapid progression to clinical illness [22].
We conducted a population-based molecular epidemiologic study of tuberculosis in San
Francisco. In addition to providing an estimate of the incidence of tuberculosis that
results from recently transmitted infection, we identified some of the risk factors for the
transmission of M. tuberculosis. Our results suggest that current tuberculosis-control
strategies have important limitations in contemporary urban environments.
Methods
Patient Identification and Routine Data Collection
The population studied included all patients with tuberculosis who were reported to the
San Francisco Department of Public Health, Division of Tuberculosis Control, between
January 1, 1991, and December 31, 1992. The routine demographic data collected
included age, sex, race or ethnicity, country of birth, number of years of residency in the
United States, and address at the time of diagnosis. Specific information concerning
tuberculosis included the date of diagnosis, site or sites of disease, results of chest
radiographs, and results of microbiologic studies.
The registries for tuberculosis and the acquired immunodeficiency syndrome (AIDS)
maintained by the San Francisco Department of Public Health were cross-matched to
identify all patients reported to have both tuberculosis and AIDS as of September 1993.
Confidentiality was ensured by having health department personnel remove all
identifying information before the data analysis. The subjects' socioeconomic status was
estimated by matching patients' addresses at the time of diagnosis to census-tract data
(including indexes of unemployment, income, poverty level, education level, crowding,
immigration status, and racial or ethnic distribution). Census-tract information was not
included for 14 homeless persons.
Collection of M. tuberculosis Isolates and RFLP Analysis
Lowenstein-Jensen slant cultures used for mycobacterial identification and drug-
susceptibility testing were prospectively collected for all microbiologically confirmed
new cases of tuberculosis in San Francisco. RFLP analysis was performed with an
internationally standardized method with internal molecular-weight standards [23]. The
resulting autoradiographs were compared using the Bio Image Whole Band Analyzer,
version 3.0 (Millipore, Ann Arbor, Mich.). All lanes that were found by computer
analysis to have similar patterns were compared visually and classified as having
matching RFLP patterns if the number and molecular weights of the bands were identical.
Microbiology records were scrutinized for all patients who had only a single positive
culture for which a smear for acid-fast bacilli was negative. These cultures were
considered to be false positive if they were processed in the microbiology laboratory on
the same day as a specimen with a positive smear from another patient with the same
RFLP pattern [24].
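The cross-contamination criterion described above can be written as a simple rule. The following Python sketch is illustrative only and is not the software used in the study; the `Specimen` record, its field names, and the string labels standing in for RFLP patterns are assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Specimen:
    # Hypothetical record layout; field names are assumptions for illustration.
    patient_id: str
    processed_on: date
    smear_positive: bool
    culture_positive: bool
    rflp_pattern: str  # arbitrary label standing in for a fingerprint


def is_suspected_false_positive(patient_id, specimens):
    """Apply the stated rule to one patient, given all laboratory specimens."""
    positives = [s for s in specimens
                 if s.patient_id == patient_id and s.culture_positive]
    # The rule applies only to a single positive culture with a negative smear.
    if len(positives) != 1 or positives[0].smear_positive:
        return False
    culture = positives[0]
    # Look for a smear-positive specimen from another patient, processed the
    # same day, whose isolate has the same RFLP pattern.
    return any(other.patient_id != patient_id
               and other.smear_positive
               and other.processed_on == culture.processed_on
               and other.rflp_pattern == culture.rflp_pattern
               for other in specimens)


day = date(1991, 3, 4)  # made-up processing date
log = [
    Specimen("A", day, smear_positive=False, culture_positive=True, rflp_pattern="P1"),
    Specimen("B", day, smear_positive=True, culture_positive=True, rflp_pattern="P1"),
    Specimen("C", day, smear_positive=False, culture_positive=True, rflp_pattern="P2"),
]
print(is_suspected_false_positive("A", log))
print(is_suspected_false_positive("C", log))
```

In this made-up log, patient A is flagged because a smear-positive specimen from patient B with the same pattern was processed on the same day; patient C is not flagged because no other specimen shares pattern P2.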
Epidemiologic Investigations
For all patients treated by the Division of Tuberculosis Control, an investigation of
contacts was conducted by trained, multilingual disease-control investigators using
standard methods [25]. For patients whose care was not managed by the Division of
Tuberculosis Control, contact investigation was conducted either by the treating
physician or by Tuberculosis Control personnel. In addition to the routine contact
investigation, selected groups of patients infected with organisms with identical RFLP
patterns were studied further by a more intensive review of the Division of Tuberculosis
Control records. For patients in the largest cluster, all available clinic and hospital records
were reviewed and the patients were interviewed.
Statistical Analysis
Data were entered and analyzed with FoxPro 2.5 (Microsoft, Redmond, Wash.), EpiInfo
(Centers for Disease Control and Prevention, Atlanta), Egret (Statistics and Epidemiology
Research Corporation, Seattle), and PC SAS (SAS Institute, Cary, N.C.) computer
programs. A cluster was defined as two or more patients with identical RFLP patterns.
Patients with unmatched RFLP patterns were considered nonclustered.
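The cluster definition above amounts to grouping patients by fingerprint. A minimal Python sketch follows, assuming each fingerprint has already been reduced to a list of discrete band sizes; in the study, matching was done by computer-assisted comparison followed by visual review, which this sketch does not reproduce. The function name, patient labels, and band sizes are hypothetical.

```python
from collections import defaultdict


def group_into_clusters(fingerprints):
    """fingerprints maps a patient ID to the band sizes of that patient's isolate."""
    by_pattern = defaultdict(list)
    for patient_id, bands in fingerprints.items():
        # Identical number and sizes of bands give an identical key.
        by_pattern[tuple(sorted(bands))].append(patient_id)
    # A cluster is two or more patients with the same pattern.
    clusters = [sorted(ids) for ids in by_pattern.values() if len(ids) >= 2]
    nonclustered = sorted(ids[0] for ids in by_pattern.values() if len(ids) == 1)
    return clusters, nonclustered


# Made-up band sizes (in base pairs) for four hypothetical patients.
example = {
    "p1": [1200, 2500, 4100],
    "p2": [1200, 2500, 4100],
    "p3": [1200, 2500],
    "p4": [900, 3300, 5000, 7200],
}
clusters, nonclustered = group_into_clusters(example)
print(clusters)
print(nonclustered)
```

Here p1 and p2 form one cluster, whereas p3 differs in the number of bands and p4 in their sizes, so both are nonclustered.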
Student's t-test and the chi-square test were used to assess univariate risk factors for being
in a cluster. Risk factors for clustering identified by univariate analysis were then
included in multivariate logistic-regression models, with clustered and nonclustered as
the dependent outcomes. Because age appeared to be related to clustering in a nonlinear
fashion, with a marked decrease in risk at the age of 60 years, age was categorized as
either less than 60 years or 60 years or older. Odds ratios were calculated from regression
estimates based on the chi-squared approximation for the likelihood-ratio statistic; 95
percent confidence intervals were based on the estimated variance of the regression
coefficients [26]. The likelihood-ratio statistics were also used to contrast the relative
goodness of fit between competing logistic-regression models. Tests for interaction were
conducted for all likely interacting variables. Age, sex, and factors that remained
significant after adjustment for related variables were included in a final model.
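The relation between a logistic-regression coefficient, its odds ratio, and a confidence interval can be sketched as follows. This is an illustration under stated assumptions rather than the analysis performed in the study: the interval is assumed to be of the Wald form (coefficient plus or minus 1.96 standard errors, exponentiated), the likelihood-ratio comparison is assumed to involve nested models differing by one parameter, and all numeric inputs are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Return (odds ratio, lower, upper) from a logistic-regression coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * std_err),
            math.exp(beta + z * std_err))


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Compare nested models that differ by one parameter (1 degree of freedom)."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value


# Hypothetical coefficient and standard error, not values from this study.
or_, lower, upper = odds_ratio_with_ci(beta=math.log(2.0), std_err=0.3)
print(round(or_, 2), round(lower, 2), round(upper, 2))

# Hypothetical log-likelihoods for two nested models.
stat, p = likelihood_ratio_test(loglik_reduced=-310.0, loglik_full=-306.0)
print(round(stat, 2), round(p, 4))
```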
Results
Patient Population and RFLP Patterns Obtained
During 1991 and 1992, 688 cases of tuberculosis were reported to the Division of
Tuberculosis Control, 585 of which were confirmed by the isolation of M. tuberculosis.
Viable isolates of M. tuberculosis were not available from 89 patients. These patients
were similar to the 496 patients included in this study except that they were slightly older
(median age, 46 years; P = 0.02) and more likely to be Asian (RFLP data were not
available on 20 percent of Asian patients, P = 0.003).
Nine of the 496 patients were excluded from further study because their culture results
fulfilled the criteria for laboratory cross-contamination. RFLP analysis of the strains
isolated from the remaining 487 patients identified 326 distinct patterns, 282 of which
were found in only 1 patient.
Previously published molecular biologic and epidemiologic studies have concluded that a
clonal relation cannot be inferred to exist between strains of M. tuberculosis that have
only one copy of IS6110 [27,28]. Accordingly, the 12 M. tuberculosis strains with only one
copy of IS6110 were not included in the epidemiologic analysis. Consequently, the
statistical analysis was based on 473 patients (Table 1) and 324 RFLP patterns, of
which 44 were found to be shared by at least 2 patients (i.e., they were in clusters). The
44 shared RFLP patterns were obtained from 191 patients (Table 2). The RFLP patterns of
strains isolated from clusters containing three or more patients are shown in Figure 1.
Thus, 191 of the 473 patients (40 percent) were in 1 of the 44 clusters; the clusters ranged
in size from 2 to 30 patients.
Table 1. Analysis of Risk Factors for Clustering in 473 San Francisco Patients.
Table 2. Cluster Sizes and the Number of Clusters among 473 San Francisco Patients with
Tuberculosis.
Figure 1. Results of RFLP Analysis of M. tuberculosis Strains Isolated from Three or More
Patients.
Identification of Risk Factors
To identify risk factors for recent infection with M. tuberculosis, the 191 patients in
clusters were compared with the 282 patients not in clusters. Univariate analysis (Table
1) showed that patients in clusters were more likely to be male, young (mean age, 40.8
years, vs. 48.4 years for patients not in clusters; P<0.001), black or Hispanic, and born in
the United States; to have AIDS; to have received care at the Division of Tuberculosis
Control clinic; and to reside in a census tract with a poverty rate of more than 20 percent.
In contrast, a history of tuberculosis and Asian race were associated with a significantly
decreased risk of being in a cluster. Infection with drug-resistant M. tuberculosis and the
level of crowding and education in the census tract were not associated with clustering
(data not shown).
Multivariate analysis of the risk factors for clustering revealed significant differences
between younger and older patients (Table 3). For patients younger than 60 years, risk
factors for clustering included Hispanic ethnicity (odds ratio, 3.3; P = 0.02), black race
(odds ratio, 2.3; P = 0.02), birth in the United States (odds ratio, 5.8; P<0.001), and AIDS
(odds ratio, 1.8; P = 0.04). In contrast, for patients 60 years of age or older, the only
significant risk factor was having been cared for at the Division of Tuberculosis Control
clinic (odds ratio, 5.7; P = 0.008). In the older age group, Asian race was again associated
with a reduced risk of being in a cluster.
Table 3. Analysis of Risk Factors for Clustering after Adjustment for Sex and Age at
Diagnosis.
Epidemiologic Investigation of RFLP Clusters
Intensive epidemiologic investigations were conducted of the 3 largest clusters and the 20
clusters composed of only two patients. Thus, 23 of the 44 clusters (52 percent), or 108 of
the 191 patients (56 percent) with isolates with identical RFLP patterns, were included in
this analysis.
Routine investigation had established that 12 patients in the largest cluster (Table 2) were
living in or employed by a residential facility for patients with AIDS [12]. Our RFLP
analysis identified an additional 18 patients with isolates with the same fingerprint who
were not previously known to have any association with the facility. Seven of these
patients were available for interview, eight had died, and three could not be located or
refused to be interviewed.
The apparent index patient in this cluster was a 38-year-old white man with AIDS who
was receiving general assistance, was not compliant with antituberculous therapy, and
had had positive sputum smears for approximately six months. Specific transmission
links could be established among nine of the patients who were not associated with the
residential facility (Figure 2): two named one another as contacts, three were on the same
hospital ward, and four were in the same general medical clinic at a time when it was
reasonable to assume that transmission had occurred. Although seven additional patients
were homeless, homosexual, or substance abusers, they were not otherwise linked
epidemiologically. Three patients had no discernible connection with any of the other
patients.
Figure 2. Transmission Links Identified between Patients with Isolates in the Largest
Cluster.
The second-largest cluster contained 23 patients who were primarily young (average age,
33 years), born in the United States (18 patients), and male (19 patients); 13 had AIDS,
and 8 were substance abusers. The index patient was a 28-year-old white HIV-infected
transsexual man who was an intravenous drug user and a prostitute. He had been found to
have tuberculosis, with a positive sputum smear, shortly after moving to San Francisco
and was noncompliant with therapy. The M. tuberculosis strain found in this patient was
next isolated from four other young homeless HIV-infected intravenous drug users over a
three-month period and subsequently from a more diverse group of patients.
The apparent index patient in the third-largest cluster (15 patients) was a 36-year-old
HIV-seronegative black alcoholic man with cavitary pulmonary tuberculosis. He
frequently used public facilities, including homeless shelters, detoxification centers,
public clinics, and hospitals. This patient also was noncompliant with therapy and had
had positive sputum smears for nine months. Most of the other patients in this cluster
were also black (12 patients) and alcoholic (8 patients); only 5 of the 15 patients were
recorded as having AIDS.
Efficacy of Contact Tracing
A conventional investigation of the patients' contacts identified connections among only
19 of the 191 patients (10 percent) found to be connected by RFLP analysis. Of the three
largest clusters, contact tracing identified only the outbreak in the AIDS facility.
To examine further the relation between the patients' characteristics and the accuracy of
conventional contact tracing, we studied the 20 clusters that contained only two patients
each. Conventional contact tracing conducted before the RFLP results were available
predicted transmission in only four of these clusters, all of them involving contact
between an older patient who presumably had reactivated tuberculosis and a younger
person in a traditional household setting. No instances of transmission between
immigrants, transients, or patients with AIDS were predicted from the contact
investigation.
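The yield of conventional contact tracing reported above can be recomputed directly from the stated counts. The short Python sketch below does only that arithmetic; the helper name `percent` and the variable names are introduced here for illustration.

```python
def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage."""
    return 100.0 * part / whole


# Figures reported in the text above.
patients_linked_by_tracing = 19
patients_linked_by_rflp = 191
pairs_predicted_by_tracing = 4
two_patient_clusters = 20

print(round(percent(patients_linked_by_tracing, patients_linked_by_rflp)))
print(round(percent(pairs_predicted_by_tracing, two_patient_clusters)))
```

The first figure corresponds to the 10 percent quoted in the text; the second shows that contact tracing anticipated transmission in one fifth of the two-patient clusters.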
Discussion
We used a systematic, population-based RFLP analysis of M. tuberculosis isolates in
conjunction with conventional epidemiologic methods to describe the contemporary
pattern of tuberculosis transmission in San Francisco. The information produced by this
approach is consistent with that yielded by traditional reporting practices in that it
enumerates and characterizes the cases that occurred during a given period in a single
public health jurisdiction. However, our data provide considerably more information
about tuberculosis transmission in this urban area, including evidence that an important
factor in the resurgence of tuberculosis, despite an efficient tuberculosis-control program,
is the ongoing transmission of a few strains of M. tuberculosis in specific subgroups of
the population.
The use of RFLP analysis to identify the pathways of tuberculosis transmission within a
community is based on the premise that epidemiologically unrelated cases will have
occurred as a result of the reactivation of latent infection and thus have unique RFLP
patterns, whereas cases that are linked as a consequence of recent infection will have the
same patterns (i.e., appear in a defined cluster). In this study, the first contention is
supported by the vast diversity of RFLP patterns in San Francisco: 326 distinct patterns
among the 487 strains analyzed. The second is supported by the congruence of the
molecular-fingerprinting data and results of the epidemiologic study of tuberculosis
outbreaks [4,8-21].
We found that 191 of the 473 patients (40 percent) had 1 of 44 clustered RFLP patterns
and thus may have been epidemiologically linked. Assuming that a typical cluster of n
persons comprises one index patient with reactivated disease and n - 1 patients with
recently acquired disease, we estimate that at least 147 (191 - 44), or 31 percent, of the 473
cases were due to recent infection that had progressed to active disease during the two-year
study period. Because RFLP analysis can only be used to analyze microbiologically
confirmed cases, patients who became infected but whose infection remained latent
during the course of the study were not identified. Reactivation of infection in these
latently infected persons will continue to produce overt disease for decades. As a result,
the true magnitude of the increased burden of tuberculosis due to recent M. tuberculosis
infection in San Francisco is probably greater than our estimate of 31 percent.
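The estimate above follows from subtracting one presumed index case per cluster from the number of clustered patients. A minimal Python sketch of that calculation, using only the counts reported in the text, is given below; the function and variable names are introduced for illustration.

```python
def recent_transmission_fraction(clustered_patients, clusters, total_patients):
    """'n - 1' estimate: one presumed index (reactivation) case per cluster."""
    recent_cases = clustered_patients - clusters
    return recent_cases / total_patients


clustered = 191   # patients whose isolates shared an RFLP pattern
n_clusters = 44   # distinct shared patterns
total = 473       # patients in the statistical analysis

print(clustered - n_clusters)
print(round(100 * clustered / total, 1))
print(round(100 * recent_transmission_fraction(clustered, n_clusters, total), 1))
```

Because each cluster contributes exactly one subtracted case regardless of its size, the estimate depends only on the number of clustered patients and the number of clusters, not on the distribution of cluster sizes.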
A principal objective of this study was the identification of risk factors for recent
infection. Because we focused only on cases reported during a two-year period, our
analysis of risk factors encompassed only the subgroup of recently infected patients
whose infection progressed to active disease during this interval. As a result,
epidemiologic risk factors for transmission are necessarily combined with biologic risk
factors that are associated with rapid progression.
For patients less than 60 years of age, a diagnosis of AIDS, birth in the United States,
black race, and Hispanic ethnicity were found by multivariate analysis to be significant,
independent risk factors. HIV seropositivity itself was not a significant risk factor for
clustering in the patients for whom HIV serologic data were available (data not shown),
probably reflecting the importance of the degree of immunosuppression in the
development of tuberculosis. In contrast, patients with AIDS and severe
immunosuppression are at increased risk of being in a cluster. This probably reflects the
combined effects of a shortened interval between infection and active disease and the
tendency for patients with AIDS to be brought together in common medical or living
facilities.
Being born in the United States also might act as a risk factor through a biologic
mechanism, since most such persons will have a negative tuberculin test and thus lack the
relative immunity associated with latent tuberculosis. In younger subjects, birth outside
the United States protected against newly acquired infection. Even after adjustment for
race and ethnicity, the immigrant population was significantly more likely to have
reactivated disease (and was less likely to be in a cluster) than persons born in the United
States. This may reflect the high rate of latent tuberculosis infection in children born in
developing countries. If so, our results suggest that childhood infection both protects
immigrants from new infection and places them at risk for reactivation.
Strikingly different risk factors were found for persons 60 years of age or older. In this
age group, treatment at the municipal tuberculosis clinic was the only variable identified
as a risk factor for clustering. Because most patients cared for in this clinic have already
been given a diagnosis of tuberculosis, the clinic itself is unlikely to have been a locus for
transmission. Instead, its use may be a proxy for the use of other social and medical
facilities where transmission may have occurred. In the older age group, being Asian was
a significant negative risk factor for clustering, probably because many older patients
have latent infection that may become reactivated.
Epidemiologic investigation of the three largest clusters reconfirms that a single patient
with highly infectious disease can have a major impact on urban programs of tuberculosis
control. Each of the index patients had positive smears and was poorly compliant in
taking the prescribed antimicrobial therapy. In the largest cluster the putative index
patient, one of the few patients not treated successfully by the San Francisco Tuberculosis
Control Program, apparently infected 29 additional patients. Thus, this one patient
accounted for 6 percent of the cases evaluated in San Francisco during the study period.
Data collected by the Centers for Disease Control and Prevention show that such
noncompliant patients are uncommon in San Francisco, where during the study period at
least 95 percent of patients completed their regimens of antituberculous drugs. The
cumulative contribution of such persons may be much greater in areas where compliance
rates are lower and multidrug-resistant tuberculosis is prevalent.
Overall, conventional contact tracing, conducted by an efficient tuberculosis-control
program, identified only 10 percent of the patients in clusters. This low level of efficacy
is best explained by the overrepresentation in clusters of unemployed and homeless
persons, who may have become infected in settings determined primarily by lifestyle and
by social subgroups. Contacts of this kind may have been multiple, transient, and difficult
to reconstruct by routine tracing techniques. The overrepresentation of patients with
AIDS may also have reduced the efficacy of contact tracing in this group, since the
presumably increased susceptibility of such persons to tuberculosis may have permitted
transmission to occur in settings where exposure is neither prolonged nor intense. Casual
transmission of this kind is hard to detect with current techniques of contact tracing.
This study has three major implications for urban tuberculosis control. First, because
more cases of tuberculosis are arising as a result of recent infection with M. tuberculosis
than has been heretofore appreciated, increased emphasis should be placed on the
identification of sites of transmission and the application of environmental controls.
Second, because a single infectious patient may have devastating effects on tuberculosis
control, the treatment of patients with infectious tuberculosis must be prompt and
effective. Third, because only 10 percent of the patients in clusters were identified by a
conventional investigation of contacts, novel approaches to contact tracing may need to
be developed and targeted to specific populations.
Supported in part by the Howard Hughes Medical Institute, grants from the National
Institutes of Health (K08 AI01137-01 and R01 AI34238-01), and a grant from the
Centers for Disease Control and Prevention (U52-CCU 900454).
We are indebted to the personnel of the San Francisco Department of Public Health
Division of Tuberculosis Control, whose high quality of service and cooperation have
made this work possible; to Aimee LaPerriere-Hunt for diligent research assistance; to
Arthur Back (deceased), Anna Babst of the San Francisco Public Health Laboratory,
Arthur Reingold, Gretchen Anderson, and the Western Consortium for Public Health,
Bacterial and Mycotic Surveillance Project for assistance with the collection of M.
tuberculosis; to Kevan Gross and Eric Preston for essential assistance with computer-
software design; to Karl Reich for many thoughtful discussions regarding the molecular
biology of M. tuberculosis; to Lorene Nelson and Jerry Halpern of the Stanford
University Department of Health Research and Policy for important advice about
statistical analysis; and to Dr. Nancy Krieger, Kaiser Permanente Division of Research,
Oakland, Calif., for San Francisco County census-tract information.