Clinical progression parameters associated with SARS-CoV-2, influenza, and respiratory syncytial virus infections in a large US integrated healthcare population
Noah T. Parker, Vennis Hong, Gregg S. Davis, Magdalena Pomichowski, Iris A. Reyes, Fagen Xie, Nicola F. Mueller, Isabel Rodriguez-Barraquer, Sara Y. Tartof, Joseph A. Lewnard

TL;DR
This study analyzed healthcare data to track how SARS-CoV-2, influenza, and RSV infections progress through different levels of care, from virtual visits to hospitalization and death.
Contribution
The study provides detailed progression probabilities and time estimates for respiratory infections across care levels using a large healthcare dataset.
Findings
RSV had the highest proportion of cases resulting in inpatient admission, ventilation, or death (33.8%) compared to SARS-CoV-2 (7.9%) and influenza (5.8%).
Older age and more comorbidities were associated with higher care acuity levels for all three viruses.
Median hospital stays were similar across the three infections, ranging from 4.0 to 4.3 days for admitted cases.
Abstract
Mathematical and computational models are often used to forecast respiratory infectious disease burden, including to inform healthcare capacity. We aimed to characterize pathways of clinical progression associated with SARS-CoV-2, influenza, and respiratory syncytial virus (RSV) infections using data from patients aged 0 to >90 years in an integrated healthcare system, whose encounters were monitored across all levels of acuity spanning virtual, ambulatory, and inpatient care settings. Using parametric survival models, we estimated probabilities of progression and distributions of time to progression from each setting to all higher-acuity settings on a cascade encompassing the following classes of events or encounters: symptoms onset; diagnostic testing; telehealth or other virtual care appointment; outpatient physician office visit; urgent care presentation; emergency department…
Funder: Center for Forecasting and Outbreak Analytics (http://dx.doi.org/10.13039/100025287)
Taxonomy
Topics: Respiratory viral infections research · COVID-19 epidemiological studies · COVID-19 Clinical Research Studies
Introduction
Acute respiratory illnesses (ARIs) caused by SARS-CoV-2, influenza, and respiratory syncytial virus (RSV) are important contributors to morbidity and mortality in the United States and globally [1–3]. Anticipating healthcare utilization associated with ARIs is an objective of both public health agencies and healthcare delivery organizations. Mathematical and computational models used to forecast ARI burden are often trained using data from either syndromic surveillance or reported cases, hospital admissions, and deaths associated with each infection [4]. Such models employ diverse frameworks, often including mechanistic approaches simulating the natural history and transmission dynamics of infection [5], multiplier approaches anticipating care utilization needs at differing levels of acuity [6], and forecasting or nowcasting approaches based on time series [7–9]. To inform capacity planning—including decisions around the allocation of personnel, space, medications, laboratory infrastructure, and other resources—such models require realistic parameters concerning the likelihood and time course of cases’ healthcare utilization [10].
Despite this need, few real-world data sources address clinical care trajectories associated with ARIs due to SARS-CoV-2, influenza, and RSV. While parameters such as the proportion of SARS-CoV-2 infections resulting in hospital admission or death [11,12] and durations of hospital stay [13,14] were estimated in numerous settings during the early phases of the COVID-19 pandemic, fewer studies have addressed utilization patterns in lower-acuity healthcare settings such as ambulatory clinics and emergency departments, where the greatest numbers of all medically-attended cases receive care. Moreover, updated epidemiological parameter estimates for recent SARS-CoV-2 variants and in populations with widespread immunity are not widely available. These challenges are equally pronounced in efforts to model seasonal influenza and RSV. Although some studies have aimed to characterize reporting pyramids addressing symptomatic or medically attended, hospitalized, and fatal influenza cases for contexts of seasonal [15] and pandemic influenza [6,16], these studies have drawn on data from disparate sources and settings, and do not address time-to-event parameters that are likewise critical to forecasting.
We aimed to characterize pathways of clinical progression during ARIs associated with SARS-CoV-2, influenza, and RSV. We analyzed data from patients enrolled in capitated, managed care plans within an integrated healthcare system in southern California. This rich data source allowed us to monitor patient encounters across all levels of acuity spanning virtual, ambulatory, and inpatient care settings. We quantified how patients progress through the different settings of care using parametric survival models among all ascertained infections. The outputs of this analysis provide a basis for modeling clinical burden and healthcare system impacts of SARS-CoV-2, influenza, and RSV infections.
Methods
Ethics statement
This study was reviewed and approved by the Kaiser Permanente Southern California institutional review board, which granted a waiver of informed consent for retrospective analysis of EHR data.
Overview of the modeling approach
We defined a cascade of clinical progression wherein we aimed to estimate (a) the probability that an individual observed at any state along this cascade would progress to a higher-acuity state, and (b) the distributions of times to progression from lower-acuity to higher-acuity states (Fig 1; S1 File). States were characterized as the highest-acuity settings where individuals had received care for ARI at a given point during their infection. We defined the states in order of increasing acuity as: any infection or symptomatic infection without associated healthcare utilization (besides testing); telehealth or other virtual care appointment; outpatient physician office visit; urgent care presentation; emergency department presentation; hospital admission; mechanical ventilation; and death. Our analyses account for the fact that individuals may progress from lower- to higher-acuity states without being intercepted via healthcare encounters at intermediate levels between these origin and destination states. As ongoing receipt of care in lower-acuity settings does not provide a basis for inferring recovery, our analysis considers “forward” transitions to care in higher-acuity settings only.
Fig 1: Clinical care cascade for acute respiratory illnesses associated with SARS-CoV-2, influenza, and RSV infections. We illustrate cumulative distribution functions from best-fitting models (defined by Akaike information criterion values) for times from symptom onset to progression to (or beyond) each acuity threshold. Plus signs (+) next to states indicate receiving care at the indicated level or a higher-acuity level of care. Right-hand panels illustrate cumulative distribution functions for rare outcomes (mechanical ventilation and death).
We fit parametric survival models to jointly estimate the probabilities of progression and the distributions of time to progression from each state to all higher-acuity states on the cascade. We modeled three classes of transitions aiming to inform distinct forecasting applications. First, we modeled an individual’s most proximal progression event from each originating state on the cascade, aiming to characterize typical pathways of care utilization (e.g., the probability and time-to-event for the first hospital admission following emergency department presentation). Secondly, we modeled an individual’s total probability of progression to each state or more severe states from all lower-acuity originating states on the cascade, aiming to inform projections of total demand at each level of acuity within a patient cohort observed at any point in time (e.g., the probability and time-to-event for care receipt at an urgent care facility following a virtual care appointment). Lastly, we aimed to estimate durations of hospital stay following inpatient admission. We explored the association of progression risk and progression rates with individual epidemiologic and demographic characteristics by including covariates in survival models.
We share parameters of all fitted models for re-use within the supplementary materials and via an online repository (https://github.com/ntparker3/Resp_params).
Setting
We used data from healthcare encounters among members of Kaiser Permanente Southern California (KPSC), an integrated healthcare system providing care across virtual, outpatient, and inpatient settings to roughly 4.7 million individuals throughout southern California during the study period. Members of KPSC enroll through a combination of employer-sponsored, pre-paid, and government-subsidized insurance plans, and broadly reflect the socioeconomic and racial and ethnic diversity of the area’s population [17,18]. Electronic health records (EHRs) capture clinical notes, diagnoses, laboratory results, and prescriptions for care received at KPSC facilities, while insurance claims capture out-of-network care, enabling near-complete ascertainment of healthcare delivery for members. Testing for the studied viral pathogens is conducted by in-house clinical laboratories, with results linked to patient EHR data via unique patient identification numbers. Medically-supervised deaths are tracked via EHR, while medically-unsupervised deaths are reconciled with health plan administrative and clinical databases, member proxy reporting, Social Security Administration vital status data, and California death certificates.
Study population and episode definition
We conducted event-level analyses of ARIs associated with SARS-CoV-2, influenza, and RSV among individuals of any age who tested positive for each pathogen by molecular or antigen-based assays in any clinical setting between 1 April 2023 and 31 March 2024. We selected this study period to ensure results were not impacted by disruptions in routine care delivery associated with earlier emergency phases of the COVID-19 pandemic. Additionally, screening for SARS-CoV-2 infection at the point of hospital admission or other healthcare encounters was no longer undertaken by KPSC during the study period. While co-infections were exceptionally rare, we allowed overlapping episode periods to contribute to observations for each unique virus identified. We limited the study population to individuals with ≥1 year of continuous enrollment in a Kaiser health plan before their index test (allowing for enrollment gaps up to 45 days in length, as captured from membership enrollment and disenrollment dates) to ensure accurate characterization of individuals’ baseline health status from prior-year utilization. For children aged <1 year, this requirement applied to parents.
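The enrollment criterion above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the tuple layout of the enrollment spans, and the choices to require enrollment on the index date and to treat a gap at the start of the look-back window like any other gap are all assumptions.

```python
from datetime import date, timedelta

MAX_GAP_DAYS = 45    # allowed enrollment gap, as stated in the text
LOOKBACK_DAYS = 365  # ">=1 year" of enrollment before the index test


def continuously_enrolled(spans, index_date,
                          lookback_days=LOOKBACK_DAYS,
                          max_gap_days=MAX_GAP_DAYS):
    """Return True if the member is covered over the look-back window.

    spans: list of (enroll_date, disenroll_date) tuples, both inclusive
           (hypothetical layout of membership records).
    """
    window_start = index_date - timedelta(days=lookback_days)
    # Keep only spans that overlap the look-back window, in date order.
    relevant = sorted(s for s in spans
                      if s[1] >= window_start and s[0] <= index_date)
    cursor = window_start  # earliest date not yet known to be covered
    for start, end in relevant:
        if start > cursor:
            uncovered_days = (start - cursor).days
            if uncovered_days > max_gap_days:
                return False
        cursor = max(cursor, end + timedelta(days=1))
    # Assumption: the member must still be enrolled on the index date.
    return cursor > index_date


# Example: a 31-day gap in January 2023 is tolerated, a 60-day gap is not.
short_gap = [(date(2022, 1, 1), date(2022, 12, 31)),
             (date(2023, 2, 1), date(2023, 12, 31))]
long_gap = [(date(2022, 1, 1), date(2022, 12, 31)),
            (date(2023, 3, 2), date(2023, 12, 31))]
```

With an index test on 1 June 2023, the first member qualifies and the second does not.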
We defined index tests for each ARI episode as the date of specimen collection associated with the first positive test for each pathogen during the study period. Symptoms data (presence of and onset dates for fever, cough, headache, fatigue, dyspnea, chills, sore throat, myalgia, anosmia, diarrhea, vomiting or nausea, and abdominal pain within 14 days before testing) were solicited at the point of testing for all individuals who received SARS-CoV-2 tests during the study period. Symptoms were recorded for most ARI episodes, as nearly all individuals who received influenza or RSV tests within KPSC were previously or concurrently tested for SARS-CoV-2. We supplemented structured data from symptoms questionnaires with searches of free-text EHR fields via a previously described natural language processing (NLP) algorithm [19] to characterize the presence and onset times of symptoms using all available information. We defined the date of symptoms onset as the earliest recorded date within 14 days before to 30 days after testing for each episode.
We characterized ARI episodes using data from all healthcare encounters occurring between the date of symptoms onset (or up to seven days before the date of testing, if symptoms were not recorded at the time of testing) and 30 days after the testing date. We defined dates of progression to each state as the first date at which individuals received care in the associated clinical setting, including at least one ARI-related diagnosis code (S1 Table in S1 File). We excluded any care utilization events where ARI codes were not assigned to ensure that healthcare encounters unrelated to an ongoing ARI episode were not interpreted as indicators of disease progression.
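The episode definition in the two paragraphs above can be expressed as a small set of date rules. The sketch below is illustrative only: the function names and the `(date, setting, has_ari_code)` record layout are hypothetical, and the window start is taken literally as the onset date, or seven days before testing when no onset was recorded.

```python
from datetime import date, timedelta


def symptom_onset(test_date, recorded_onsets):
    """Earliest recorded onset from 14 days before to 30 days after testing."""
    lo = test_date - timedelta(days=14)
    hi = test_date + timedelta(days=30)
    eligible = [d for d in recorded_onsets if lo <= d <= hi]
    return min(eligible) if eligible else None


def episode_window(test_date, onset):
    """Window runs from onset (or test date minus 7 days) to test date plus 30."""
    start = onset if onset is not None else test_date - timedelta(days=7)
    return start, test_date + timedelta(days=30)


def first_progression_dates(encounters, test_date, recorded_onsets):
    """First in-window date of ARI-coded care in each setting.

    encounters: iterable of (date, setting, has_ari_code) tuples
                (hypothetical record layout).
    """
    onset = symptom_onset(test_date, recorded_onsets)
    start, end = episode_window(test_date, onset)
    first = {}
    for when, setting, has_ari_code in encounters:
        # Encounters without an ARI diagnosis code are ignored.
        if has_ari_code and start <= when <= end:
            if setting not in first or when < first[setting]:
                first[setting] = when
    return first


example_encounters = [
    (date(2023, 12, 6), "virtual", True),       # before onset: excluded
    (date(2023, 12, 8), "urgent_care", True),
    (date(2023, 12, 9), "urgent_care", True),   # later repeat visit
    (date(2023, 12, 12), "emergency", False),   # no ARI code: excluded
    (date(2023, 12, 15), "inpatient", True),
]
example_first = first_progression_dates(
    example_encounters,
    test_date=date(2023, 12, 10),
    recorded_onsets=[date(2023, 12, 7), date(2023, 11, 1)],
)
```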
Statistical analysis
We fit parametric survival models using the flexsurv package [20] in R (version 4.4.2; R Foundation for Statistical Computing, Vienna, Austria). We estimated parameters corresponding to assumptions that times-to-event followed exponential, Weibull, Gompertz, Gamma, generalized Gamma, and log-normal distributions for all transitions. The modeled distributions invoke differing assumptions about the underlying rates or processes of progression across clinical care settings. The exponential distribution assumes events occur independently with a constant rate, an assumption that the Weibull and Gompertz distributions relax by allowing rates to vary over time. The Gamma distribution defines times to progression as the sum of exponentially-distributed event times, which in our application may correspond to underlying disease progression events prompting receipt of higher-acuity care, while the generalized Gamma combines the refinements of both the Weibull and Gamma distributions. The log-normal distribution, in contrast, lacks a similar mechanistic interpretation but may provide a good approximation to observed event time distributions. For brevity, we describe results for models with the lowest Akaike information criterion (AIC) values for each transition in this manuscript, and present parameter estimates for all distributions evaluated within the accompanying code base (https://github.com/ntparker3/Resp_params). Practitioners, however, should consider the mechanistic assumptions underlying differing time-to-event distributions alongside or in lieu of model selection criteria when choosing among the differing parameterizations generated.
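To make the AIC-based comparison concrete, the sketch below fits two of the listed distributions (exponential and Weibull) to right-censored times by maximum likelihood and compares their AIC values. It is a plain-Python illustration rather than the flexsurv workflow the authors used; the data are made up, and the Weibull fit uses a simple ternary search over the shape parameter with the scale profiled out in closed form.

```python
import math


def exp_loglik(times, events):
    """Maximized exponential log-likelihood with right-censoring."""
    d = sum(events)
    rate = d / sum(times)  # closed-form MLE: events / total follow-up time
    return d * math.log(rate) - rate * sum(times), rate


def weibull_profile_loglik(shape, times, events):
    """Weibull log-likelihood with the scale set to its MLE for this shape."""
    d = sum(events)
    sum_tk = sum(t ** shape for t in times)
    sum_log_t = sum(math.log(t) for t, e in zip(times, events) if e)
    return (d * math.log(shape) - d * math.log(sum_tk / d)
            + (shape - 1) * sum_log_t - d)


def fit_weibull(times, events, lo=0.05, hi=20.0, iters=200):
    """Ternary search over the shape; the profile likelihood is concave."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if (weibull_profile_loglik(m1, times, events)
                < weibull_profile_loglik(m2, times, events)):
            lo = m1
        else:
            hi = m2
    shape = (lo + hi) / 2
    scale = (sum(t ** shape for t in times) / sum(events)) ** (1 / shape)
    return weibull_profile_loglik(shape, times, events), shape, scale


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


# Hypothetical days to progression; event = 0 marks censoring at day 20.
times = [0.5, 1.2, 2.0, 2.5, 3.1, 4.0, 6.3, 8.0, 20.0, 20.0]
events = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

ll_exp, rate = exp_loglik(times, events)
ll_wei, shape, scale = fit_weibull(times, events)
aic_by_model = {"exponential": aic(ll_exp, 1), "weibull": aic(ll_wei, 2)}
best_model = min(aic_by_model, key=aic_by_model.get)
```

Because the exponential is the Weibull with shape 1, the Weibull log-likelihood can never be lower; AIC then asks whether the gain justifies the extra parameter.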
For models of the most proximal transition from each state, we followed individuals for progression events within 20 days after their dates of entry into each originating state. We considered observations to be censored if no progression event occurred within 20 days. As a sensitivity analysis, we also present estimates for models including follow-up through 60 days for these transitions. We used mixture models to estimate probabilities of and times to the most proximal progression event from each originating state. These models defined distinct rates for progression between each originating state and all higher-acuity states, and handled progression to each state as a competing risk. This framework corresponded to the interpretation that progression to a higher-acuity state of illness could precede progression to intermediate states along the same cascade.
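One way to read the mixture formulation is that each possible next state has its own probability and its own time-to-event distribution, and the remaining probability mass corresponds to no further progression. The sketch below illustrates this with the next-encounter probabilities reported in this paper for SARS-CoV-2 cases after a virtual visit. The exponential timing and all median times except the 2.0 days to admission are placeholders chosen for illustration, not estimates from the study.

```python
import math

# Probability that the next encounter after a virtual visit occurs in each
# setting (SARS-CoV-2, 20-day follow-up), as reported in the Results.
NEXT_STATE_PROB = {
    "outpatient": 0.072,
    "urgent_care": 0.050,
    "emergency": 0.071,
    "inpatient": 0.019,
}

# Median days to the next encounter. Only the inpatient value (2.0) is
# reported in the text; the others are hypothetical placeholders.
MEDIAN_DAYS = {
    "outpatient": 3.0,
    "urgent_care": 2.5,
    "emergency": 2.5,
    "inpatient": 2.0,
}


def cumulative_incidence(state, t):
    """P(next encounter is `state` and occurs by day t).

    Assumes exponential times within each mixture component (an
    illustrative choice; the study selected distributions by AIC).
    """
    rate = math.log(2) / MEDIAN_DAYS[state]
    return NEXT_STATE_PROB[state] * (1 - math.exp(-rate * t))


def prob_any_progression(t):
    """P(any higher-acuity encounter by day t), summing over components."""
    return sum(cumulative_incidence(s, t) for s in NEXT_STATE_PROB)


prob_never_progressing = 1 - sum(NEXT_STATE_PROB.values())
```

Under these assumptions, half of the eventual admissions (about 0.95% of virtual-care patients) would have occurred by day 2.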
For models of individuals’ total risk of progression to each acuity state (or higher-acuity states), we followed individuals for progression events within 60 days after their dates of entry into each originating state. When individuals experienced care corresponding to multiple states on the same day (e.g., an emergency presentation leading to hospital admission), we defined the highest-acuity state observed as the outcome. In contrast to analyses for individuals’ most proximal transition, models for individuals’ risk of progression to each state, cumulative across all intermediate pathways of care, did not require a competing-risks framework. For these analyses, we instead recorded progression as occurring when individuals experienced the outcome of interest or one signifying receipt of higher-acuity care.
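The "event or worse" outcome definition can be sketched as follows. The state labels and the `(day, state)` event layout are hypothetical; the ordering of states follows the cascade defined in the Methods.

```python
# Care settings in order of increasing acuity, following the cascade above.
ACUITY_ORDER = ["virtual", "outpatient", "urgent_care", "emergency",
                "inpatient", "ventilation", "death"]
RANK = {state: i for i, state in enumerate(ACUITY_ORDER)}


def same_day_highest(events):
    """Collapse same-day events to the highest-acuity state observed.

    events: list of (day, state) tuples, with day counted from entry
            into the originating state (hypothetical layout).
    """
    best = {}
    for day, state in events:
        if day not in best or RANK[state] > RANK[best[day]]:
            best[day] = state
    return sorted(best.items())


def time_to_threshold(events, threshold):
    """First day with care at or above `threshold`; None if never reached."""
    days = [day for day, state in events if RANK[state] >= RANK[threshold]]
    return min(days) if days else None


example_events = [(1, "virtual"), (3, "emergency"), (3, "inpatient"),
                  (9, "death")]
```

In this example the patient reaches the urgent care threshold on day 3 without ever visiting urgent care, because the emergency and inpatient encounters count as higher-acuity care.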
To estimate hospital lengths of stay, we fit parametric survival models defining admission dates as originating events and dates of discharge with any disposition or in-hospital mortality as the outcome; we modeled consecutive admissions with same-day readmissions as continuous hospitalization events. We also used mixture models defining competing risks for death and discharge to separately estimate durations of hospitalization according to individuals’ clinical outcome.
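The rule for treating same-day readmissions as one continuous hospitalization amounts to merging adjacent date intervals. A minimal sketch, assuming each stay is an `(admit_date, discharge_date)` tuple (a hypothetical layout):

```python
from datetime import date


def merge_stays(stays):
    """Merge stays where readmission falls on or before the prior discharge."""
    merged = []
    for admit, discharge in sorted(stays):
        if merged and admit <= merged[-1][1]:
            # Same-day readmission (or overlap): extend the previous stay.
            merged[-1] = (merged[-1][0], max(merged[-1][1], discharge))
        else:
            merged.append((admit, discharge))
    return merged


def length_of_stay_days(stay):
    admit, discharge = stay
    return (discharge - admit).days


example_stays = [
    (date(2024, 1, 1), date(2024, 1, 4)),
    (date(2024, 1, 4), date(2024, 1, 9)),    # readmitted on discharge day
    (date(2024, 1, 20), date(2024, 1, 22)),  # separate admission
]
example_merged = merge_stays(example_stays)
```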
Differences in rates across patient subgroups could impact the validity of population-wide estimates. We therefore repeated the analyses described above to estimate subgroup-specific risks and rates of progression based on age, sex, race/ethnicity, vaccination status, Charlson comorbidity index (a weighted index of 19 different comorbidities where a higher score indicates a greater risk of mortality [21]; S2 Table in S1 File), and community-level socioeconomic status, as measured by census tract-level neighborhood deprivation index values derived from the 2017–2021 5-year estimates of the American Community Survey [22,23]. We categorized continuous covariates according to the distributions presented in Table 1. We fit parametric survival models allowing variation across covariate strata in both the probability of progression and the location parameter for times-to-event for each modeled distribution. As for our primary analyses, we describe results for models yielding the lowest AIC value for each transition in this manuscript, and present parameter estimates for all distributions in the accompanying repository. We evaluated 95% confidence intervals around estimated probabilities of progression and median times-to-events to assess whether differences across groups were statistically or epidemiologically meaningful.
Table 1: Individual characteristics by infecting virus.
Results
Descriptive characteristics
Our analyses included data from 348,958 unique KPSC members who received tests for SARS-CoV-2, influenza, or RSV between 1 April 2023 and 31 March 2024, among whom we identified 59,670 episodes associated with positive SARS-CoV-2 test results, 23,375 episodes associated with positive influenza test results, and 1,668 episodes associated with positive RSV test results (Table 1). Among these episodes, 602 were associated with coinfections (579 SARS-CoV-2 and influenza coinfections, 11 SARS-CoV-2 and RSV coinfections, and 12 influenza and RSV coinfections). In total, 2,737 ARI episodes occurred without associated testing for SARS-CoV-2, influenza, or RSV over the study period, and were not eligible for inclusion in our analyses (S2 Fig in S1 File). The greatest numbers of SARS-CoV-2 and influenza infections occurred among individuals aged 18–49 years (n = 23,033 [38.6%] and n = 8,475 [36.3%], respectively). For influenza and RSV, a considerable number of episodes also occurred among children aged ≤17 years (n = 7,348 [31.4%] and n = 920 [55.2%], respectively), while 10.7-26.7% of infections with each pathogen occurred among individuals aged ≥70 years (n = 13,911 with SARS-CoV-2, n = 2,561 with influenza, and n = 444 with RSV). Most episodes involving each pathogen occurred among Hispanic individuals of any race or White, non-Hispanic individuals without comorbid conditions (S2 Table in S1 File). Across all three pathogens, roughly half (41.2-55.6%) of all infections occurred among individuals enrolled in commercial insurance plans, and a smaller share (10.4-17.9%) occurred among individuals enrolled in Medicaid-sponsored plans.
Among SARS-CoV-2 infections, 6,944 (11.6%), 11,271 (18.9%), and 41,454 (69.5%) occurred among individuals who had received 0, 1–2, and ≥3 COVID-19 vaccine doses, cumulatively; 12,269 influenza infections (52.5%) occurred among individuals who had received seasonal influenza vaccination, and few RSV infections (n = 27; 1.6%) occurred among individuals who were previously vaccinated against RSV. Characteristics of individuals receiving ARI diagnoses in any setting differed from those of individuals receiving care in inpatient settings (S3 Table in S1 File).
Care pathways for SARS-CoV-2
For ARIs associated with SARS-CoV-2 infection, the first clinical encounter following symptoms onset most often occurred in urgent care (18.6%) or emergency department (17.9%) settings, followed by virtual care appointments (10.7%), outpatient office visits (7.2%), and inpatient settings (4.7%; Table 2; S4 Table; S5 Table in S1 File). Median time from symptoms onset to testing was 3.2 days (Fig 2). Among individuals who received virtual care, 21.1% subsequently received care in higher-acuity clinical settings in the following 20 days, with 7.2%, 5.0%, and 7.1% presenting to outpatient office visits, urgent care facilities, and emergency departments as their next clinical encounter, respectively; 1.9% were admitted to hospital at their next clinical encounter (Table 2). For individuals who were admitted at their next clinical encounter after a virtual care appointment, median time to admission was 2.0 days (interquartile range [IQR]: 0.6-5.2). Among individuals who received care at outpatient and urgent care facilities, 4.0% and 1.7%, respectively, were admitted to the hospital at their next clinical encounter after a median of 0.9 and 0.4 days, respectively. We obtained similar estimates in analyses accommodating follow-up through 60 days (S6 Table; S7 Table in S1 File).
Table 2: Care utilization pathways associated with each infecting virus using a follow-up period of 20 days.
Fig 2: Time-to-event distributions of reaching acuity thresholds from symptom onset for illnesses associated with SARS-CoV-2, influenza, and RSV. For healthcare utilization states with two panels, the top panel illustrates the time from symptom onset to ever seeking care at that specific state, while the bottom panel shows the time from symptom onset to reaching that acuity threshold (seeking care at the state or a state more severe). Panels for positive test and death show the time from symptom onset to reaching that exact state. Black lines represent the density of the best-fitting distribution, selected by AIC.
Median duration of inpatient stay for SARS-CoV-2 infections was 4.2 days (IQR: 2.6-7.3); median time to discharge was 4.1 days (IQR: 2.6-6.9) for patients who were discharged alive (Fig 3; S8 Table in S1 File). In the 20 days following inpatient admission for SARS-CoV-2 infections, 5.1% of patients required mechanical ventilation after a median 2.6 days (IQR: 0.8-6.5; Table 2). Median time to in-hospital death was 7.3 days (IQR: 3.7-13.3). Accounting for both in-hospital and out-of-hospital mortality, the 60-day risk of death after inpatient admission was 14.1%, with 11.3% of admitted patients dying without proceeding to mechanical ventilation, and 50.2% dying after initiating mechanical ventilation (Table 3; S6 Table in S1 File).
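For readers who want to plug summary statistics like these into a capacity model, one rough option is to back out a distribution from the reported median and IQR. The sketch below does this for the SARS-CoV-2 length of stay (median 4.2 days, IQR 2.6-7.3) under the assumption of a log-normal distribution. That family is chosen here only for convenience; the best-fitting family for this transition is not stated in this section, and the fitted parameters in the authors' repository should be preferred where available.

```python
import math
from statistics import NormalDist


def lognormal_from_quartiles(median, q25, q75):
    """Log-normal (mu, sigma) matching the median and the log-scale IQR width."""
    z75 = NormalDist().inv_cdf(0.75)
    mu = math.log(median)
    sigma = (math.log(q75) - math.log(q25)) / (2 * z75)
    return mu, sigma


def quantile(p, mu, sigma):
    """Length of stay (days) below which a fraction p of stays fall."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(p))


def prob_stay_exceeds(days, mu, sigma):
    """P(length of stay > days) under the assumed log-normal."""
    return 1 - NormalDist(mu, sigma).cdf(math.log(days))


# Reported summary statistics for SARS-CoV-2 inpatient stays.
MU, SIGMA = lognormal_from_quartiles(median=4.2, q25=2.6, q75=7.3)
share_over_two_weeks = prob_stay_exceeds(14, MU, SIGMA)
```

Because a log-normal is symmetric on the log scale and the reported quartiles are not exactly symmetric around the median, the implied quartiles will differ slightly from 2.6 and 7.3 days.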
Table 3: Care utilization pathways at each acuity threshold for each infecting virus.
Fig 3: Durations of hospital stay. (Top row) We plot distributions from best-fitting models for durations of hospital stay, overall and stratified according to clinical outcome. (Bottom row) We plot distributions of time from inpatient admission to initiation of mechanical ventilation.
Care pathways for influenza and RSV
Median times from symptoms onset to testing were 3.4 and 5.6 days for influenza and RSV, respectively, corresponding to differences in the clinical care settings at which testing most frequently occurred for each pathogen (Fig 2). The first clinical encounter occurred in urgent care for 36.7% of influenza cases, in emergency departments for 28.8% of cases, and in hospital settings for 3.4%; a greater proportion of confirmed RSV infections (18.7%) were first intercepted in hospital settings (Table 2).
Median durations of hospital stay for influenza cases and RSV cases were 4.0 days (IQR: 2.3-6.8) and 4.3 days (IQR: 2.5-7.4), respectively (S8 Table in S1 File). The 60-day risk of death after hospital admission was 7.9% among influenza cases and 5.0% among RSV cases (Table 3). Median time to in-hospital death following admission was 5.2 days (IQR: 2.6-10.5) among influenza cases and 11.3 days (IQR: 5.8-17.6) among RSV cases. In the 60 days following an inpatient admission, median times from admission to death were 17.5 days (IQR: 4.1-31.4) and 18.7 days (IQR: 9.2-30.0) for influenza and RSV cases, respectively, who did not require mechanical ventilation, while median times from initiation of mechanical ventilation to death were 5.5 days (IQR: 1.4-15.3) and 12.0 days (IQR: 5.0-23.9) for influenza and RSV cases, respectively (S6 Table in S1 File).
Care requirements for all observed infections
For SARS-CoV-2 infections, median times from symptoms onset to receipt of care at or above the virtual care, outpatient physician office, urgent care, or emergency department thresholds were in the range of 3.9-4.5 days (Table 4). Overall, 7.9% of all observed SARS-CoV-2 infections resulted in inpatient admission or death, occurring a median 6.8 days (IQR: 3.6-13.2) after symptoms onset. Progression to illness necessitating mechanical ventilation and death occurred markedly later in the course of illness (median 22.8 days [IQR: 12.0-34.8] and 26.2 days [IQR: 15.2-37.7] after symptoms onset, respectively) than initial inpatient admission.
Table 4: Observed proportions of cases attaining or exceeding each acuity threshold.
Nearly all influenza and RSV infections were linked to ARI diagnoses resulting from care appointments in any setting around the time of individuals’ first eligible positive test (93.1% and 92.5%, respectively; Table 4). Median times from symptoms onset to receipt of care at virtual, outpatient, urgent care, and emergency department or higher-acuity settings were 3.2, 3.5, 3.6, and 4.0 days, respectively, for influenza, and 4.7, 4.8, 5.0, and 5.2 days, respectively, for RSV (Fig 2). Median times from symptoms onset to inpatient admission, mechanical ventilation, and death were 6.4-6.6 days, 13.4-15.3 days, and 22.6-23.7 days, respectively.
Associations of care trajectories with individual characteristics
The proportion of cases receiving care at each acuity level increased with older age for all infections; age differences were most pronounced for high-acuity outcomes (e.g., inpatient admission, mechanical ventilation, and mortality; Fig 4; S9 Table in S1 File). Median times from symptoms onset to receipt of care at or above the level of outpatient office visits increased with older age, spanning a difference of ~1 day between the ≤17 year and ≥90 year age groups for all three viral infections (3.3 vs. 4.7 days for SARS-CoV-2 infections, 3.2 vs. 4.4 days for influenza infections, and 4.0 vs. 4.8 days for RSV infections), although these differences across ages in times to event were attenuated for higher-acuity outcomes. Individuals with greater numbers of comorbid conditions also had higher chances of receiving care at each level of acuity and longer median times to presentation, for each virus (S10 Table in S1 File).
Fig 4: Probabilities and time-to-event distributions of reaching outpatient and inpatient acuity thresholds from symptom onset for illness associated with a SARS-CoV-2 infection across demographic subgroups. The best-fitting distributions for the symptom onset to outpatient/inpatient acuity threshold were used across covariates, but the corresponding location parameter was allowed to vary by subgroup. Probabilities of reaching acuity thresholds are available in S9–S13 Tables in S1 File.
Whereas a greater proportion of males than females with SARS-CoV-2 infection experienced high-acuity outcomes (e.g., 9.9% vs. 6.7% with inpatient admission or higher-acuity outcomes, 2.1% vs. 1.1% mortality; S11 Table in S1 File), this pattern was less clearly apparent for influenza cases and was reversed for RSV. Times to each outcome were similar for male and female cases with each infection. With regard to individuals’ vaccination status and neighborhood deprivation index values, we did not identify patterns across pathogens or across outcomes with respect to any subgroup experiencing consistently higher or lower likelihood of progression, or consistently longer or shorter times to progression (S12 Table; S13 Table in S1 File).
The probability of in-hospital mortality for SARS-CoV-2 infections was higher among older adults than among younger adults (11.0% at ages ≥90 years versus 2.1% in the 18–49 year age group; S14 Table in S1 File). There were also significant differences across Charlson comorbidity subgroups, with 9.0% of cases with a score ≥6 experiencing in-hospital mortality associated with SARS-CoV-2 infection in comparison to 3.9% mortality among cases with a score of 0. This trend was also apparent in influenza infections (7.2% vs. 1.7%), although not in RSV infections.
Median durations of hospital stay were similar across groups for each infection. Males had a longer median time to in-hospital mortality than females for SARS-CoV-2 infections (8.5 vs. 6.8 days), but shorter times to mortality for influenza and RSV infections (4.9 vs. 5.7 days and 8.1 vs. 16.1 days, respectively; S14 Table in S1 File). We did not observe differences across subgroups with respect to vaccination status, race or ethnicity, or neighborhood deprivation index in the probability of in-hospital mortality or length of hospital stays.
GitHub repository
In addition to the descriptive supplementary materials associated with this manuscript, we have created a GitHub repository containing parameter estimates for all analyses described (https://github.com/ntparker3/Resp_params). The repository includes four files for each pathogen (SARS-CoV-2, influenza, and RSV), contents of which are listed below:
Parameterized distributions and summary statistics of the proximal progression event occurring from each originating state (“first event”);
Parameterized distributions and summary statistics for individuals’ risk of progression to or above each acuity threshold, from each originating state (“event or worse”);
Probabilities of progression to or above each acuity threshold, from each originating state, across subgroups of the specified individual-level covariates (“event or worse covariates”); and
Location parameters and median times to event for progression to outpatient (or higher-acuity) and inpatient (or higher-acuity) thresholds, from each originating state, across subgroups of the specified individual-level covariates (“covariate rates”).
Discussion
Our analysis provides estimates of transition rates and probabilities for healthcare utilization due to the progression of ARIs associated with SARS-CoV-2, influenza, and RSV infections. These outputs aim to inform models anticipating resource needs for healthcare systems and public health stakeholders, drawing on real-world observations within a US managed care setting. In addition to presenting aggregated results for all cases infected with SARS-CoV-2, influenza, and RSV, we present stratified results for differing subgroups for which models may aim to generate predictions; these encompassed patient demographics (age, sex, race/ethnicity), comorbidity burden, prior vaccination, and community-level socioeconomic disadvantage. Among these characteristics, we identified the strongest evidence of differences in progression risk and times-to-event across age groups and comorbidity profiles. Our outputs fill frequently-described gaps in the data needed for application of viral respiratory infection models [24–26] and may inform future forecasting efforts tailored to US healthcare contexts, particularly those aiming to inform healthcare resource allocation [10,27,28].
Previous studies have reported widely varying estimates of times from symptoms onset to hospital admission for COVID-19 [29–32] and the duration of hospital stay among COVID-19 patients [13,14,33], with both parameters differing across settings and over time within settings in association with evolving clinical practices. Whereas numerous studies have monitored patients hospitalized with each virus [34,35], fewer have tracked outcomes longitudinally from early points in the disease course such as symptoms onset or receipt of care in virtual or ambulatory facilities. Within our study, only 4.7% of COVID-19 cases (7.9% of all COVID-19 cases who received care in any setting) were first intercepted at the point of hospital admission, while among influenza and RSV cases, 3.4% and 18.7%, respectively, were first seen in inpatient settings. These circumstances suggest that projecting outcomes among individuals receiving care in lower-acuity settings may help to refine forecasts of higher-acuity clinical care needs.
Application of our estimates to forecasting models requires several assumptions or considerations. First, we frame consecutive transitions between states as memoryless, consistent with modeling approaches where estimates from these analyses may be applied (e.g., Markov chain next-state transitions as well as cumulative probabilities of attaining each state overall and from preceding states). Second, our analyses are subset to individuals who ultimately received care including diagnostic testing: events preceding testing among individuals included in these analyses, particularly in low-acuity care settings, may not represent care utilization pathways among individuals who were ultimately never tested—a problem related to previously described biases affecting interval distribution estimation [8,36,37]. Furthermore, testing for influenza and RSV was more strictly limited to individuals who received care in outpatient or inpatient facilities, whereas SARS-CoV-2 tests were widely available across all care settings. In particular, RSV testing in adults is restricted to inpatient settings, with the majority of testing, and subsequent recorded cases, being conducted in children. Thus, differences in the overall proportions of SARS-CoV-2 infections, influenza infections, and RSV infections attaining each acuity threshold should not be interpreted as differences in the severity of disease caused by each infection. The clinical threshold associated with testing in our study population may also differ from that in other healthcare systems, geographic regions, or countries. Increased testing among individuals with less-severe disease would be expected to lower the proportion of episodes expected to progress to hospital admission or other high-acuity outcomes. As this circumstance could also occur through testing at earlier stages in individuals’ illness, such increases in testing for less-severe disease could lead to longer estimated times to progression. 
Last, associations of the studied covariates with the proportions of cases experiencing each outcome and with times-to-event should not be interpreted as causal. In some instances, observed patterns reflect previously reported independent associations, such as associations of older age and the presence of comorbidities with severe disease outcomes [38]. However, other findings, such as the lack of association of prior vaccination with protection against severe outcomes, echo previous evidence of higher uptake of vaccines against COVID-19, seasonal influenza, and RSV among individuals at greatest risk [39–41]. These analyses aim to inform prediction even if they lack direct causal interpretation.
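To make the memoryless framing described above concrete, the following sketch treats care settings as states in a Markov chain with next-state transition probabilities and derives the cumulative probability of ever attaining a higher-acuity state from a given origin. The state names and all probabilities are hypothetical placeholders, not study estimates, and the chain assumes that acuity only escalates so that the recursion terminates.

```python
# Hypothetical next-state transition probabilities (NOT study estimates).
# Any probability mass not listed for a state means "no further progression".
TRANSITIONS = {
    "virtual": {"outpatient": 0.20, "emergency": 0.05, "inpatient": 0.01},
    "outpatient": {"emergency": 0.08, "inpatient": 0.02},
    "emergency": {"inpatient": 0.25},
    "inpatient": {},
}

def reach_probability(origin: str, target: str, transitions=TRANSITIONS) -> float:
    """Probability of ever attaining `target` when starting from `origin`.

    Under the memoryless assumption, the chance of reaching the target from a
    state depends only on that state, not on how the patient arrived there.
    """
    if origin == target:
        return 1.0
    return sum(
        p * reach_probability(next_state, target, transitions)
        for next_state, p in transitions[origin].items()
    )

for origin in ("virtual", "outpatient", "emergency"):
    p = reach_probability(origin, "inpatient")
    print(f"P(ever inpatient | start in {origin}) = {p:.4f}")
```

With these placeholder values, the cumulative probability of inpatient admission from a virtual visit is 0.01 + 0.05 × 0.25 + 0.20 × (0.02 + 0.08 × 0.25) = 0.0305, which shows how direct and indirect pathways both contribute to the overall probability of attaining a state.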
Our analysis has at least seven limitations. First, KPSC represents a single healthcare system. While strengths include the integration of care delivery and data capture across outpatient and inpatient settings, the large enrollee population, and its racial/ethnic and socioeconomic diversity [13], it remains important to note that care utilization and delivery pathways may not be generalizable to all settings. Estimation of similar parameters in other US populations remains an important objective. Second, aiming to enable the broad application of our estimates, we fit parametric distributions to times-to-event that may not perfectly represent underlying processes. For this reason, we supply best-fitting parameter estimates for 6 different distributions for all times-to-event. Because AIC may not adequately penalize overfitting, and because multiple distributions may provide similar fit to observed data, practitioners should consider mechanistic interpretations as well as the underlying assumptions of differing distributions in choosing which may be the most appropriate for their modeling applications. Third, our estimates reflect the major SARS-CoV-2 variants (e.g., XBB, BA.2.86/JN.1, and KP.2) and seasonal influenza lineages (e.g., A(H1N1)pdm09, A(H3N2), B(Victoria)) circulating during the study period and may not generalize to lineages circulating during future years. Fourth, the reliability of self-reported and physician-recorded symptoms onset dates may be imperfect, as signified by patterns such as heaping of times from symptoms onset around 7 days and 14 days before testing and other healthcare encounters [42]. Fifth, recovery was not explicitly recorded as a clinical outcome, necessitating censoring of observation periods without future healthcare encounters. Sixth, while restricting progression events to healthcare encounters where ARI diagnoses were assigned was anticipated to reduce misclassification, coding practices (e.g., carry-forward of diagnosis codes) may lead to misclassification of some encounters.
Cessation of healthcare facility-based SARS-CoV-2 screening by the time of our study was further anticipated to mitigate risks of misclassifying healthcare encounters “with” or “for” COVID-19. Last, our analyses preceded the widespread implementation of RSV vaccines among pregnant mothers and older adults, which may alter RSV-related healthcare utilization in future seasons for population groups at the highest risk of severe disease.
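The point about AIC-based selection among candidate distributions, raised under the second limitation, can be illustrated with a minimal sketch that fits two families by closed-form maximum likelihood and compares their AIC values. The data are synthetic, only two families are shown (the choice of exponential and log-normal is an assumption made for brevity), and censoring is ignored, so this is a simplification of a parametric survival analysis rather than a reproduction of the study's method.

```python
import math
import random

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower values indicate better support."""
    return 2 * n_params - 2 * log_likelihood

def fit_exponential(times):
    """Closed-form MLE for an exponential distribution (uncensored data)."""
    n = len(times)
    rate = n / sum(times)
    loglik = n * math.log(rate) - rate * sum(times)
    return {"rate": rate}, aic(loglik, 1)

def fit_lognormal(times):
    """Closed-form MLE for a log-normal distribution (uncensored data)."""
    n = len(times)
    logs = [math.log(t) for t in times]
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    # At the MLE, the squared-deviation term of the log-likelihood equals n / 2.
    loglik = -sum(logs) - n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi) - 0.5 * n
    return {"mu": mu, "sigma": sigma}, aic(loglik, 2)

# Synthetic times-to-event in days (NOT study data).
rng = random.Random(7)
synthetic_days = [rng.lognormvariate(math.log(4.0), 0.5) for _ in range(2000)]

fits = {
    "exponential": fit_exponential(synthetic_days),
    "lognormal": fit_lognormal(synthetic_days),
}
best_aic = min(score for _, score in fits.values())
for name, (params, score) in sorted(fits.items(), key=lambda item: item[1][1]):
    print(f"{name}: AIC = {score:.1f} (delta = {score - best_aic:.1f}), params = {params}")
```

Reporting the AIC difference alongside each fit, rather than only the winner, makes it easier to see when several families are supported about equally, in which case mechanistic considerations may reasonably guide the choice.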
These limitations notwithstanding, our analyses provide a useful entry point for modeling real-world trajectories of healthcare needs associated with SARS-CoV-2, influenza, and RSV infections. Extensive ARI-associated healthcare utilization in virtual, outpatient, and urgent care settings among persons ultimately hospitalized suggests monitoring of lower-acuity healthcare utilization may help to inform near-term hospital capacity requirements. Incorporating data on lower-acuity care delivery settings into public health surveillance and reporting thus merits consideration. Similar analyses in other geographic settings or other healthcare systems, and continued updating of parameters we report to accommodate changes in viral epidemiology or healthcare delivery practices, may improve the reliability of forecasting models for SARS-CoV-2, influenza, and RSV.
Supporting information
S1 File. Supporting information, including S1 Table (Acute respiratory illness diagnosis codes); S2 Table (Counts of patients with comorbidities included in Charlson comorbidity index); S3 Table (Individual characteristics by infecting virus and severity threshold reached); S4 Table (Best-fitting distributions for care utilization pathways for all infections using a 20-day follow-up period); S5 Table (Best-fitting distributions for care utilization pathways at each acuity threshold using a 60-day follow-up period); S6 Table (Care utilization pathways associated with each infecting virus using a follow-up period of 60 days); S7 Table (Best-fitting distributions for care utilization pathways for all infections using a 60-day follow-up period); S8 Table (Hospital length of stay estimates for admissions leading to discharge or mortality); S9 Table (Proportions of cases attaining or exceeding each acuity threshold, by age); S10 Table (Proportions of cases attaining or exceeding each acuity threshold, by Charlson comorbidity index values); S11 Table (Proportions of cases attaining or exceeding each acuity threshold, by sex); S12 Table (Proportions of cases attaining or exceeding each acuity threshold, by neighborhood deprivation index); S13 Table (Proportions of cases attaining or exceeding each acuity threshold, by vaccination status); S14 Table (Stratified hospital length of stay estimates for admissions associated with each viral infection and leading to discharge or mortality); S1 Fig (Healthcare utilization cascade); S2 Fig (Study flowchart). (ZIP)
References
1. Hanage WP, Schaffner W. Burden of Acute Respiratory Infections Caused by Influenza Virus, Respiratory Syncytial Virus, and SARS-CoV-2 with Consideration of Older Adults: A Narrative Review. Infect Dis Ther. 2025;14(Suppl 1):5–37. doi: 10.1007/s40121-024-01080-4. PMID: 39739200; PMCID: PMC11724833.
2. Putri WCWS, Muscatello DJ, Stockwell MS, Newall AT. Economic burden of seasonal influenza in the United States. Vaccine. 2018;36(27):3960–6. doi: 10.1016/j.vaccine.2018.05.057. PMID: 29801998.
3. Carrico J, Hicks KA, Wilson E, Panozzo CA, Ghaswalla P. The Annual Economic Burden of Respiratory Syncytial Virus in Adults in the United States. J Infect Dis. 2024;230(2):e342–52. doi: 10.1093/infdis/jiad559. PMID: 38060972; PMCID: PMC11326840.
4. Loo SL, Howerton E, Contamin L, Smith CP, Borchering RK, Mullany LC, et al. The US COVID-19 and Influenza Scenario Modeling Hubs: Delivering long-term projections to guide policy. Epidemics. 2024;46:100738. doi: 10.1016/j.epidem.2023.100738. PMID: 38184954; PMCID: PMC12444780.
5. Ferguson N, Laydon D, Nedjati-Gilani G, Imai N, Ainslie K, Baguelin M, et al. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. 2020. doi: 10.25561/77482.
6. Presanis AM, De Angelis D, New York City Swine Flu Investigation Team, Hagy A, Reed C, Riley S, et al. The severity of pandemic H1N1 influenza in the United States, from April to July 2009: a Bayesian analysis. PLoS Med. 2009;6(12):e1000207. doi: 10.1371/journal.pmed.1000207. PMID: 19997612; PMCID: PMC2784967.
7. Shaman J, Karspeck A. Forecasting seasonal outbreaks of influenza. Proc Natl Acad Sci U S A. 2012;109(50):20425–30. doi: 10.1073/pnas.1208772109. PMID: 23184969; PMCID: PMC3528592.
8. Miller AC, Hannah LA, Futoma J, Foti NJ, Fox EB, D’Amour A, et al. Statistical Deconvolution for Inference of Infection Time Series. Epidemiology. 2022;33(4):470–9. doi: 10.1097/EDE.0000000000001495. PMID: 35545230; PMCID: PMC9148632.
