Spatial clusters of dominant lineages of uropathogenic Escherichia coli in a community dwelling patient population
Cheyenne Belmont, Pushkar Inamdar, Salma Shariff-Marco, Amina Gul, Alison J. Huang, Henry F. Chambers, Eva Raphael

TL;DR
This study finds that certain antibiotic-resistant E. coli strains causing urinary tract infections cluster in specific areas of a community, suggesting local spread rather than just antibiotic overuse.
Contribution
The study introduces a novel geospatial approach to identify clusters of AMR uropathogenic E. coli in community settings.
Findings
45% of UPEC isolates were identified as pandemic ST lineages, with ST131 being the most prevalent.
Significant spatial clusters were found for ST95, ST131, and ST69, indicating possible common-source outbreaks.
ST131 contributed the highest number of multidrug-resistant isolates.
Abstract
Antimicrobial resistance (AMR) is a major public health concern, especially in the clinical management of urinary tract infections (UTIs). While use of antimicrobial agents selects for AMR bacterial strains, it remains unclear if this factor alone drives the prevalence of UTIs caused by AMR uropathogenic Escherichia coli (UPEC) in community settings. Local prevalence of AMR UTIs may be largely influenced by spatial clusters of already-resistant sequence types within a community rather than by the initial selection of resistant strains by antimicrobial agents. The goal of this study is to examine geospatial clustering of UTI by common AMR UPEC ST lineages. We collected 551 UPEC isolates from patients receiving care in a San Francisco public healthcare system from April to September 2019. Isolates underwent multiplex PCR for rapid identification of pandemic UPEC STs (ST69, ST73, ST95,…
Figure 1
Figure 2
Funding: NIH/NIDDK; NIH/NIAID; National Center for Advancing Translational Sciences
INTRODUCTION
Community-onset urinary tract infections (UTIs) are exceedingly common infections worldwide. An estimated 150 million people develop UTIs globally every year.^1^ These infections are associated with significant clinical and economic burdens to patients and healthcare systems.^1^ Antimicrobial resistance (AMR) is a critical challenge in the clinical management of UTIs. In 2019, UTIs were found to be the fourth leading cause of death associated with bacterial AMR.^2^ While an increase in multidrug-resistant (MDR) UTIs has long been recognized in hospital settings, evidence of an increase in the prevalence of MDR UTIs in community settings is concerning.^2,3^ It is unclear whether such an increase is due to antibiotic selective pressures alone or to an increase in the prevalence and transmission of already resistant uropathogenic bacteria.
While Escherichia coli (E. coli) remains the primary cause of community-onset UTIs, this taxonomic group represents a complex and diverse range of organisms with significant variations between strains. From healthy human commensal flora to those associated with UTIs and gastrointestinal illnesses, the genetic diversity within this species is wide-ranging. Indeed, only 39.2% of predicted proteins are shared across enterohemorrhagic, uropathogenic, and commensal E. coli strains.^4^ Thus, meaningful epidemiological grouping is needed to understand how different sequence types (STs) may impact health. Molecular techniques, such as multilocus sequence typing (MLST), enable the rapid identification of new modes of transmission for infectious agents and facilitate the detection of strain-specific outbreaks within endemic disease patterns. This is especially relevant for UTIs, which are often thought to represent sporadic events related to personal hygiene, sexual activity, or medical procedures like catheterization. However, genotypic investigations using MLST have found that about half of all community-onset UTIs are caused by closely related E. coli lineages.^5–7^ This suggests possible common-source exposures to already resistant UPEC. Several studies have identified spatial clusters of AMR Enterobacteriaceae infections in the community, which may be representative of such common-source exposures.^8–11^ It is currently unknown whether specific E. coli STs cluster geographically as well. If they do, this would further support the hypothesis that seemingly sporadic AMR UTI events result from transmission dynamics and are possibly associated with environmental factors, such as water quality, sanitation, and food exposures.
In this cross-sectional study, we gathered clinical urine isolates routinely collected as part of medical care from April to September 2019. We identified E. coli lineages and investigated spatial patterning of prevalent E. coli lineages causing community-onset bacteriuria. By understanding how E. coli lineages causing community-onset bacteriuria are spatially distributed within a community, we can enhance our understanding of AMR UPEC transmission patterns and identify possible local outbreaks and environmental exposures.
MATERIALS AND METHODS
Isolate collection
This is a cross-sectional study assessing the geographic distribution of uropathogenic E. coli STs. Our study is based in a large safety-net public hospital in San Francisco, the San Francisco General Hospital and the San Francisco Health Network, that serves an estimated 100,000 patients annually. The hospital microbiology laboratory conducts clinical testing for 15 associated clinics and a local chronic care facility, located in 14 San Francisco neighborhoods. We collected all Gram-negative bacterial isolates from clinical urine cultures sent for routine testing from April 2019 to September 2019 (N = 1007) processed at the hospital microbiology laboratory.
Electronic medical record (EMR) data, abstracted by the UCSF CTSI data abstraction services, was linked to clinical isolate data. Here, we include urine cultures from patients with suspected UTI and asymptomatic bacteriuria. We define community-onset bacteriuria episodes caused by E. coli as cases in which a urine culture was obtained from an outpatient clinic or emergency department, or within 48 hours of inpatient admission, and yielded an organism identified as E. coli.
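The case definition above can be written as a small rule. The following is a minimal sketch, not the authors' implementation: the setting labels, the function name, and the reading of "within 48 hours of inpatient admission" as 0 to 48 hours after admission are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical setting labels; the study's EMR field values are not given in the text.
AMBULATORY_SETTINGS = {"outpatient_clinic", "emergency_department"}
INPATIENT_WINDOW = timedelta(hours=48)


def is_community_onset_ecoli(organism, setting, collected_at, admitted_at=None):
    """Return True if a urine culture meets the community-onset E. coli definition."""
    if organism != "Escherichia coli":
        return False
    if setting in AMBULATORY_SETTINGS:
        return True
    if setting == "inpatient" and admitted_at is not None:
        # Assumption: the culture must be collected 0 to 48 hours after admission.
        elapsed = collected_at - admitted_at
        return timedelta(0) <= elapsed <= INPATIENT_WINDOW
    return False
```

Under these assumptions, an inpatient culture taken 47 hours after admission counts as community-onset, while one taken 49 hours after admission does not.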
Patient demographic characteristics and comorbidity data extracted from the EMR included patient geocoded address as of 2019, age at time of culture, sex (male or female), self-reported race and ethnicity (Asian American or Pacific Islander, Black, Latine, White, or other/declined to state), and preferred language spoken (Mandarin and Cantonese, English, Spanish, other or not stated). Comorbidities were evaluated based on the previous 5 years of EMR ICD-9 and ICD-10 codes, and an unweighted Charlson Comorbidity Index (CCI) score was calculated.^12^ This study was approved by the UCSF Committee on Human Research (IRB number 19–27233) and the SFGH Research Committee.
Speciation and antibiotic susceptibility testing
Bacterial isolates were collected from the hospital microbiology laboratory on blood agar purity plates and we further sub-cultured isolates on MacConkey and Blood Agar Biplates. The biochemical profile of urine bacterial isolates was confirmed by the hospital microbiology laboratory based on current Clinical and Laboratory Standards Institute (CLSI) guidelines.^13^ Isolates were speciated with API 20E (bioMérieux, Durham, NC) for fermenters or API 20NE for non-enteric bacteria. Indole testing was conducted as secondary confirmation of presumptive E. coli in our laboratory. The hospital microbiology laboratory performs antimicrobial susceptibility testing (AST) using the Microscan WalkAway Gram-negative panel and disk diffusion, with classification of resistance based on CLSI breakpoint standards.^13^ The microbiology laboratory classified extended-spectrum beta-lactamase producing E. coli (ESBL-E. coli) as an E. coli strain resistant to ceftazidime or cefotaxime and inhibited by clavulanic acid using broth microdilution, per 2016 CLSI guidelines.^13^ A multidrug-resistant (MDR) isolate was defined by phenotypic resistance to at least 1 agent in ≥ 3 classes of antimicrobial agents used to treat UTI (β-lactams, fluoroquinolones, aminoglycosides, trimethoprim-sulfamethoxazole, and nitrofurantoin).^13^ Results reported as “intermediate resistance” were considered resistant in this study.
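The MDR definition above reduces to counting drug classes with at least one non-susceptible result. The sketch below illustrates that rule only. The input layout (a mapping from class name to a list of S/I/R calls) and the function name are hypothetical, and it is not the laboratory's actual workflow.

```python
# The five classes named in the text; the dictionary keys are illustrative labels.
UTI_DRUG_CLASSES = (
    "beta-lactams",
    "fluoroquinolones",
    "aminoglycosides",
    "trimethoprim-sulfamethoxazole",
    "nitrofurantoin",
)

# Intermediate ("I") results are counted as resistant, as stated in the text.
NON_SUSCEPTIBLE = {"I", "R"}


def is_mdr(ast_results):
    """ast_results maps a drug class to the list of S/I/R calls for agents in that class."""
    resistant_classes = 0
    for drug_class in UTI_DRUG_CLASSES:
        calls = ast_results.get(drug_class, [])
        if any(call in NON_SUSCEPTIBLE for call in calls):
            resistant_classes += 1
    return resistant_classes >= 3


# Hypothetical isolate: resistant in two classes and intermediate in a third.
example_isolate = {
    "beta-lactams": ["R", "S"],
    "fluoroquinolones": ["R"],
    "aminoglycosides": ["S"],
    "trimethoprim-sulfamethoxazole": ["I"],
    "nitrofurantoin": ["S"],
}
```

Here `is_mdr(example_isolate)` returns `True`, because the intermediate trimethoprim-sulfamethoxazole result is counted as the third resistant class.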
DNA extraction and sequence typing
All bacterial DNA was extracted by the freeze-boil method. E. coli sequence types (STs) 69, 73, 95, and 131 were identified by a validated multiplex polymerase chain reaction (PCR) yielding PCR products of expected sizes (Table S1).^14^ Gel electrophoresis was used to distinguish unique band sizes to identify E. coli sequence types.^15^
Statistical and geospatial analysis
Key patient demographic and isolate characteristics were summarized with descriptive statistics, including frequencies and percentages for categorical data and mean values with maximum and minimum values for continuous data. All analyses were conducted in R 3.0.1. The Charlson Comorbidity Index was calculated using the comorbidity package in R.^12^
All spatial analyses were conducted with ArcGIS Pro. Urine isolates from patients without San Francisco residential addresses or who did not meet the criteria of community-onset bacteriuria were excluded from analyses. We conducted separate spatial analyses to identify geographic clusters of the 4 major pandemic E. coli STs within San Francisco County. A kernel density heatmap was created to assess the community-onset bacteriuria patient distribution within San Francisco. The density of points at any given location is calculated by summing the contributions of all the kernel functions centered at data points in the vicinity of that location. Patient residential confidentiality was ensured by randomly substituting new point data within a fixed buffer diameter around the original address location. The potential for spatial heterogeneity or spatial patterns amongst each of the four lineages was assessed by Global Moran’s I based on Euclidean distance and inverse distance methodology, such that all patients have at least 1 neighbor. Global Moran’s I is a statistical measure used to determine the degree of spatial autocorrelation in a dataset. Spatial autocorrelation refers to the tendency of similar values to cluster together in geographic space. Global Moran’s I calculates a single value for an entire study area or dataset, which represents the overall degree of spatial clustering or dispersion in the dataset. The value of Global Moran’s I can range from − 1 (perfect dispersion) to + 1 (perfect clustering), with 0 indicating no spatial autocorrelation. A positive value of Global Moran’s I indicates that values of the variable being analyzed are clustered together in space, while a negative value indicates that they are dispersed.^16^
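To make the description of Global Moran’s I concrete, the statistic can be computed directly from its textbook definition, I = (n / S0) · Σ_ij w_ij z_i z_j / Σ_i z_i², where z_i are deviations from the mean and S0 is the sum of all weights. The sketch below is illustrative only: the study used ArcGIS Pro, whose conceptualization settings and handling of coincident points may differ, and the function names and toy data are assumptions. Significance testing is not shown.

```python
import math


def inverse_distance_weights(points):
    """Build w_ij = 1 / d_ij for i != j from (x, y) coordinates.

    Assumption: coincident points (d_ij == 0) get weight 0 instead of an infinite weight.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            if d > 0:
                weights[i][j] = 1.0 / d
    return weights


def global_morans_i(values, weights):
    """Global Moran's I for one variable and a spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    denominator = sum(zi * zi for zi in z)
    if s0 == 0 or denominator == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    numerator = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * (numerator / denominator)


# Toy data: four locations on a line; 1 = case of the lineage of interest, 0 = other.
toy_points = [(0, 0), (1, 0), (2, 0), (3, 0)]
toy_weights = inverse_distance_weights(toy_points)
clustered = global_morans_i([1, 1, 0, 0], toy_weights)
alternating = global_morans_i([1, 0, 1, 0], toy_weights)
```

In this toy layout the clustered pattern yields a higher statistic than the alternating one. With so few points both values are negative, which is a reminder that the observed value is judged against its expectation under spatial randomness, −1/(n − 1), not against zero.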
Cluster identification was conducted through Anselin Local Moran’s I, based on the Euclidean distance method and fixed distances. The distance band threshold was determined by iteratively testing distances beginning at the average distance between cases to maximize spatial autocorrelation. Local Moran’s I, also known as the local indicator of spatial association (LISA), is a statistical measure used to identify spatial clusters of high or low values for a specific variable within a study area or dataset. Local Moran’s I is a localized version of Global Moran’s I, which calculates the degree of spatial autocorrelation across the entire dataset. Local Moran’s I calculates a separate value for each individual unit or location within the study area, which represents the degree to which that unit is surrounded by other units with similar or dissimilar values. As with Global Moran’s I, positive values indicate clustering of similar values, while negative values indicate that a location is surrounded by dissimilar values. Local Moran’s I is useful in identifying areas of high or low spatial clustering of a specific variable.^16^
Choropleth maps were generated by conducting a spatial join of cluster locations within San Francisco neighborhood boundaries defined in 2006 by the Mayor’s Office of Neighborhood Services and colored to visually display the number of high-high (HH) clusters and low-low (LL) clusters of each dominant lineage within San Francisco.^17,18^ In examining the spatial distribution of a particular UPEC ST lineage, an HH cluster would indicate a group of locations where the lineage is highly prevalent compared to other lineages, including those that are not pandemic lineages, while an LL cluster would indicate a group of locations where the lineage is rare or absent compared to other lineages. Sensitivity analyses were conducted by adjusting for a false discovery rate within Local Moran’s I.
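The local statistic can be sketched in the same spirit. The example below uses the standard Anselin form, I_i = (z_i / m2) · Σ_j w_ij z_j with m2 = Σ z² / n, and binary fixed-distance-band weights. It is a simplified illustration, not a reproduction of the ArcGIS Pro tool: the variance term may be scaled differently there, the permutation-based significance test is omitted, and the function names, labels, and toy data are hypothetical.

```python
import math


def fixed_band_weights(points, band):
    """Binary weights: w_ij = 1 when two distinct points lie within the distance band."""
    n = len(points)
    return [
        [1.0 if i != j and math.dist(points[i], points[j]) <= band else 0.0 for j in range(n)]
        for i in range(n)
    ]


def local_morans_i(values, weights):
    """Return a (statistic, label) pair per location, without significance testing."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n
    if m2 == 0:
        raise ValueError("Local Moran's I is undefined for constant values")
    results = []
    for i in range(n):
        spatial_lag = sum(weights[i][j] * z[j] for j in range(n))
        statistic = z[i] * spatial_lag / m2
        if statistic > 0:
            label = "HH" if z[i] > 0 else "LL"  # similar neighbors: cluster
        elif statistic < 0:
            label = "HL" if z[i] > 0 else "LH"  # dissimilar neighbors: outlier
        else:
            label = "none"
        results.append((statistic, label))
    return results


# Toy data: six locations on a line; 1 = lineage of interest, 0 = other lineage.
toy_points = [(x, 0) for x in range(6)]
toy_values = [1, 1, 1, 0, 0, 0]
toy_results = local_morans_i(toy_values, fixed_band_weights(toy_points, band=1.0))
toy_labels = [label for _, label in toy_results]
```

In this toy example the first two locations are labeled HH, the last two LL, and the two locations at the boundary receive no label because their neighbors cancel out. A real analysis would keep only locations whose statistic is significant under a permutation test.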
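The text does not say which false discovery rate procedure was applied. As an illustration only, the sketch below assumes a Benjamini-Hochberg step-up correction, in which the k-th smallest of m p-values is compared against alpha · k / m. The p-values are made up and are not study results.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if it stays significant after FDR control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        # Step-up rule: find the largest rank whose p-value is under its threshold.
        if p_values[index] <= alpha * rank / m:
            cutoff_rank = rank
    keep = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            keep[index] = True
    return keep


# Hypothetical local p-values for eight candidate cluster locations.
toy_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
unadjusted_hits = sum(p < 0.05 for p in toy_p_values)
adjusted_hits = sum(benjamini_hochberg(toy_p_values))
```

With these made-up values, five locations pass an unadjusted 0.05 threshold but only two survive the correction, which shows how an FDR adjustment can reduce the number of reported clusters.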
RESULTS
Patient demographic characteristics
Among the study population (N = 551), only 40 isolates (7%) came from male patients and the median patient age was 48 (Table 1). The largest group of patients identified as Latine (36.3%), and the most common preferred languages were English (37.2%), followed by Spanish (25.4%). The average CCI value of all patients was 3.44; patients whose urine grew ST73 had the lowest CCI (2.50) and those whose urine grew ST69 had the highest CCI (3.7). Only 43 patients (7.8%) were diagnosed with a prior UTI within the 5 years preceding the current episode.
Prevalence of antimicrobial resistance by sequence type
Of the 551 UPEC isolates in the study, 247 (45%) were identified as pandemic lineages (Table 2). ST131 was the most common lineage representing 72 (29%) of the pandemic STs and contributing the majority of MDR isolates (85%) and ESBL isolates (81%). The most pan-susceptible lineage was ST95; 39 (56%) isolates from that lineage were susceptible to all tested antibiotics. Resistance to fluoroquinolones was rare in all lineages, except for ST131, where 47% of isolates demonstrated resistance to fluoroquinolones. The only lineage among pandemic lineages that demonstrated resistance to nitrofurantoin was ST131 (3%).
Spatial analyses
Of the 551 E. coli isolates, 10 patient addresses could not be geolocated and 19 did not meet community-onset bacteriuria inclusion criteria. Additionally, 32 patient addresses were located outside of San Francisco County and were excluded from the analysis. The distribution of patient addresses within San Francisco was visualized in a kernel density heat map (Fig. 1). Map areas of high density of patients with community-onset bacteriuria are represented by darker colors and areas of low density are represented by lighter colors. The outcome of the Global Moran’s I tests of ST95, ST131 and ST69 showed evidence of spatial heterogeneity, or spatial clusters (p = 0.001, p = 0.001, p < 0.001, respectively) within San Francisco County (Table 3). There was an uneven distribution of various concentrations of each ST within San Francisco, warranting further cluster resolution. Results of Local Moran’s I further discerned HH and LL clusters of ST95 and HH clusters of ST131 and ST69 (Table 3). When adjusting for false discovery rate, we detected two clusters of ST69 and no clusters of other STs.
A choropleth map (Fig. 2) exhibits the presence of HH clusters and LL clusters with red and blue color ramps displaying clusters of each pandemic lineage as detected by Local Moran’s I.
DISCUSSION
Community transmission of AMR UTI is a critical public health concern that warrants improved and local surveillance. Geographic information systems (GIS) have been commonly used to analyze and describe the geospatial distribution of many diseases in recent decades, especially infectious diseases. Understanding spatial disease distribution and the potential of spatial clustering can provide insight into disease transmission, potential exposure sources, and disease reservoirs. Here, we leverage molecular biology data with EMR data to characterize the spatial distribution of uropathogenic E. coli STs, which may suggest patterns of disease transmission. We found that 70% of bacteriuria episodes in a large safety-net healthcare system in San Francisco were caused by E. coli, with nearly half (45%) belonging to 4 distinct lineages (ST95, ST69, ST131, and ST73). We identified spatial clusters of ST69, ST95, and ST131, indicating the possibility of common-source exposures to these lineages. Additionally, lineage ST131 was strongly associated with AMR, while ST95 was the most pan-susceptible lineage, as reported in other studies.
To date, there is some evidence of spatial clustering of community-onset AMR UTI, but no study has established clustering of UPEC lineages. In Brazil and in the West of Ireland, neighborhood-level clusters of fluoroquinolone-resistant E. coli causing community-onset UTI were identified. Geospatial mapping of resistant E. coli isolates revealed that most AMR isolates clustered in urban regions.^19, 20^ These studies focused on how prescribing practices in these areas may be associated with these clusters of resistant phenotypes. However, our work is the first to demonstrate spatial clusters of already resistant lineages. This may play a major role in the distribution of community-onset AMR UTI independent of antibiotic prescribing patterns.
This study employed a cross-sectional study design, which provides an opportunity to assess the prevalence of AMR E. coli causing bacteriuria and circulating sequence types. To our knowledge, this is the first report of spatial clusters of specific uropathogenic STs, demonstrating distinct variation in spatial patterns of ST prevalence. Possible transmission pathways include person-to-person exposures of UPEC, or dissemination of UPEC lineages from specific point source exposures. It may be that these bacteria are acquired from contaminated food products or other external sources within the built environment (e.g., water, environment).^18,21–27^ A recent systematic review found that ESBL-producing E. coli belonging to the same lineages (ST131, ST69, ST73) were found in food sources, companion animals and water sources.^18^ Recently, phylogenetic analysis and plasmid interrogation showed that ST131 recovered from poultry products was closely related to ST131 isolated from humans residing in the same region.^27^
Lineage ST131, which comprises 29% of our collection, has long been a lineage of concern, as it is strongly associated with ESBL phenotype and MDR. This is consistent with prior reports that ST131 contributes 85% of MDR E. coli.^10^ Lineage ST95, conversely, has a documented propensity for remaining drug susceptible.^6–8^ In our collection, 56% of ST95 isolates were found to be pan-susceptible. Thus, the geographic distribution and dissemination of these lineages may have major implications for the transmission of AMR community-onset bacteriuria.
A major strength of this study is its ability to leverage linkages between bacterial genotype and patient EMR data to find evidence of lineage-specific geographic disease clusters. Our analysis relies on patient residential address to geolocate cases; a limitation of this approach is its inability to capture disease distribution and transmission as it occurs in workplaces, schools, community venues, residences of close contacts, and other settings. We examined the sensitivity of our Local Moran’s I results by additionally adjusting for a false discovery rate, which resulted in the loss of some, but not all, clusters. The application of GIS methods within molecular epidemiological datasets is often limited by the restriction of feasible sample sizes. We believe that the decrease in clusters identified from 7 to 2 is likely due to small sample size, but, overall, the results of the Local Moran’s I analyses demonstrate that our findings are robust. Another limitation is that spatial analyses were restricted to patients with residential addresses and did not include those experiencing homelessness. Lastly, our analyses are limited to urine cultures sent routinely for testing, so there may be some selection bias due to the clinical presentation of the patient and the individual practice of the clinician.
CONCLUSION
This investigation harnesses molecular and spatial epidemiology methods to identify spatial clusters of uropathogenic bacterial lineages ST69, ST95, and ST131. Here, bacteriuria cases exhibited spatial clustering throughout San Francisco. This highlights the potential of AMR lineages, like ST131, to occur in outbreaks outside of hospital settings. Future research should prioritize investigation of spatial heterogeneity within UPEC lineages causing community-onset bacteriuria alongside other potential community-level risk factors, particularly those related to built environments and exposures other than antibiotics, which may contribute to the increasing prevalence of AMR UTI.
REFERENCES
1. Ozturk R, Murt A. Epidemiology of urological infections: a global burden. World J Urol. 2020;38:2669–79. doi: 10.1007/s00345-019-03071-4. PMID: 31925549.
2. Antimicrobial Resistance Collaborators. Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet. 2022;399(10325):629–655. doi: 10.1016/S0140-6736(21)02724-0. Erratum in: Lancet. 2022;400(10358):1102. PMID: 35065702. PMCID: PMC8841637.
3. Flores-Mireles AL, Walker JN, Caparon M, Hultgren SJ. Urinary tract infections: epidemiology, mechanisms of infection and treatment options. Nat Rev Microbiol. 2015;13:269–284. doi: 10.1038/nrmicro3432. PMID: 25853778. PMCID: PMC4457377.
4. Kaper J, Nataro J, Mobley H. Pathogenic Escherichia coli. Nat Rev Microbiol. 2004;2:123–140. doi: 10.1038/nrmicro818. PMID: 15040260.
5. Medina M, Castillo-Pino E. An introduction to the epidemiology and burden of urinary tract infections. Ther Adv Urol. 2019;11:1756287219832172. doi: 10.1177/1756287219832172. PMID: 31105774. PMCID: PMC6502976.
6. Riley LW. Pandemic lineages of extraintestinal pathogenic Escherichia coli. Clin Microbiol Infect. 2014;20(5):380–90. doi: 10.1111/1469-0691.12646. PMID: 24766445.
7. Yamaji R, Rubin J, Thys E, Friedman CR, Riley LW. Persistent Pandemic Lineages of Uropathogenic Escherichia coli in a College Community from 1999 to 2017. Diekema DJ, editor. J Clin Microbiol. 2018;56(4):e01834-17. doi: 10.1128/JCM.01834-17. PMID: 29436416. PMCID: PMC5869836.
8. Galvin S, Bergin N, Hennessy R, et al. Exploratory Spatial Mapping of the Occurrence of Antimicrobial Resistance in E. coli in the Community. Antibiotics (Basel). 2013;2(3):328–338. doi: 10.3390/antibiotics2030328. PMID: 27029306. PMCID: PMC4790267.
