The B-S2CALED Score’s Utility in Predicting Stroke Risk in Breast Cancer Patients with Atrial Fibrillation
Lakshya Seth, Nickolas Stabellini, Aditya Bhave, Gaurav Gopu, Sandeep Yerraguntla, Ahmed Shetewi, John Lester, Vraj Patel, Stephanie Jiang, Madison James, Stanley Joseph, Sai Kollapaneni, Viraj Shah, Susan Dent, Michael G. Fradley, Lars Køber, Anne Blaes, Avirup Guha

TL;DR
A new score called B-S2CALED better predicts stroke risk in breast cancer patients with atrial fibrillation compared to existing methods.
Contribution
Development and validation of a breast cancer-specific stroke risk score for patients with atrial fibrillation.
Findings
B-S2CALED outperformed CHA2DS2-VASc in predicting stroke risk in breast cancer patients with atrial fibrillation.
The new score showed higher discrimination with C-indexes of 0.64 and 0.77 in internal and external validation cohorts.
Net reclassification improvement was significantly higher for B-S2CALED compared to CHA2DS2-VASc.
Abstract
Breast cancer patients have a higher risk of atrial fibrillation and ischemic stroke than the general population, and standard ischemic risk scores are poorly validated in cancer patients. This study developed and validated a novel score to predict ischemic stroke risk in breast cancer patients with atrial fibrillation. This breast cancer-specific score outperformed CHA2DS2-VASc in predicting thromboembolic risk in cancer patients. Background: Breast cancer (BC) patients have heightened risks of atrial fibrillation (AF) and ischemic stroke (IS). Standard IS scores are poorly validated in cancer, omit cancer-specific factors, and guidelines offer no cancer-tailored management. Objectives: To develop and validate a novel score to predict IS risk in BC patients with AF. Methods: Data sources: UH Seidman Cancer Center (derivation; 40% set aside for internal validation) and MCG Cancer…
- —American Heart Association
- —U.S. Department of Defense Prostate Cancer Research Program
Taxonomy
Topics: Atrial Fibrillation Management and Outcomes · Venous Thromboembolism Diagnosis and Management · Cerebrovascular and Carotid Artery Diseases
1. Introduction
Atrial fibrillation (AF) is the most common arrhythmia, affecting ≈59 million people worldwide and carrying a lifetime risk of 1 in 3 after age 45 [1]. Among breast cancer (BC) patients, a recent meta-analysis found an AF prevalence of 3% [2]. AF increases ischemic stroke (IS) risk five-fold, yet guideline-directed anticoagulation lowers that risk by ≈65% [3]. Cancer further elevates risk: patients with cancer have a 63% higher AF incidence [4] and nearly double the IS rate [5] of the general population, likely due to shared risk factors [4], cancer-related physiologic changes [6,7], and antineoplastic cardiotoxicity and thrombogenicity [4,7,8]. One study found that BC survivors under 40 years old had a more than two-fold increased risk of AF, whereas no such increase was seen in older BC survivors, particularly those older than 65 years [9].
The CHA_2_DS_2_-VASc score, validated for IS prediction in the general AF population, performs poorly in cancer cohorts [10,11] and ignores cancer-specific thromboembolic drivers [7]. In the UK Biobank, its C-index was only 0.57 across cancers and 0.62 in breast cancer (BC) survivors [12]. The 2022 European Society of Cardiology (ESC) guidelines acknowledge the underestimation of risk in oncology patients but offer no alternative [13].
BC is an ideal test case: both the disease and its treatments (anthracyclines, radiation, and endocrine therapy) heighten AF risk [14,15,16,17] and IS [18,19]. New-onset AF after BC diagnosis worsens cardiovascular mortality [15], and BC ranks among the cancers with the highest fatal stroke burden [19].
Given the limited evidence and exclusion of cancer patients from validation studies [20,21], we aimed to create and externally validate a BC-specific score that improves IS/transient ischemic attack (TIA) risk stratification in patients who develop AF after cancer diagnosis. This new score would offer a tailored approach to each patient and incorporate variables that are readily available to clinicians in the electronic medical record to standardize stroke prevention across clinical practice. A BC-specific score would guide clinicians in deciding which patients with AF would benefit the most from initiation of anticoagulation.
2. Methods
2.1. Inclusion Criteria
We included adults (≥18 yr) with ductal carcinoma in situ (DCIS) or stage I–IV BC who developed AF after diagnosis. Patients lacking diagnosis, death, or last follow-up dates were excluded. Patient records were deidentified, and the study was approved by the Institutional Review Board (IRB).
2.2. Outcome
Primary outcome: IS or TIA after AF onset. TIA was included alongside stroke to increase power, given the low event rate in this cancer-specific cohort. While we acknowledge the diagnostic subjectivity of TIA, both outcomes represent cerebral thromboembolic risk and may share overlapping risk profiles, especially in AF populations.
2.3. Development
We used the integrated oncology repository of University Hospitals (UH), a large hybrid academic-community tertiary care center in Northeast Ohio, USA, that serves urban, suburban, and rural areas [22], and divided the cohort into derivation (50%), tuning (10%), and internal validation (40%) sets [22,23,24,25]. Eligible patients were diagnosed between 2010 and 2019, ensuring ≥2 yr of follow-up (through March 2022). AF, IS, and TIA were identified via ICD codes (Supplemental Table S1).
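A three-way split in the stated 50/10/40 proportions can be sketched as follows. This is only an illustration: the text does not say whether the split was simple random or stratified, so simple random shuffling, the function name `split_cohort`, and the seed are all assumptions.

```python
import random


def split_cohort(patient_ids, fractions=(0.5, 0.1, 0.4), seed=0):
    """Shuffle IDs and cut them into derivation, tuning, and validation sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n = len(ids)
    n_deriv = round(n * fractions[0])
    n_tune = round(n * fractions[1])
    derivation = ids[:n_deriv]
    tuning = ids[n_deriv:n_deriv + n_tune]
    validation = ids[n_deriv + n_tune:]  # remainder, so no patient is dropped
    return derivation, tuning, validation


# 935 placeholder IDs, matching the size of the UH AF cohort reported in the Results.
derivation, tuning, validation = split_cohort(range(935))
```

Assigning the remainder to the validation set keeps the three sets disjoint and exhaustive even when the cohort size does not divide evenly.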
For the internal validation cohort, we included patients diagnosed with BC between January 2010 and March 2019, providing a follow-up of at least two years through the data collection date (March 2022). A CONSORT diagram summarizes the selection of patients from both cohorts (Supplemental Figure S1).
The covariates encompassed demographics, lifestyle factors, individual- and neighborhood-level social determinants of health (SDOH), cancer characteristics, cancer treatments, comorbidities, medical history, current medications, and laboratory data prior to BC diagnosis (Supplemental Table S2). All variables were collected by medical students (LS, AB, GG, SY, AS, JL, VP, SJ, MJ, SJ, SK). AG assisted with the first 10 collections and thereafter performed random checks every 5 to 6 charts collected. Medical student abstractors achieved excellent inter-rater reliability (κ = 0.87).
2.4. Statistical Analyses
We applied Least Absolute Shrinkage and Selection Operator (LASSO) Cox regression for variable selection [26]. We chose LASSO because it uses shrinkage to reduce overfitting, accounts for the joint effect of covariates [27], does not depend on statistical significance or 95% confidence intervals (CIs), and can retain covariates that fall short of traditional significance thresholds when they appear to improve predictive performance [26,28]. This approach suits risk calculators, in which predictors are considered in combination. The optimal regularization parameter (lambda) was determined by 10-fold cross-validation, and covariates with non-zero coefficients at that lambda were retained [29]. Continuous predictors were then binarized using cubic-spline cut points where HR ≥ 1.00 [30].
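For reference, the LASSO-penalized Cox estimator has the following textbook form. The notation is an assumption introduced here, not the authors': x_i is the covariate vector of patient i, δ_i the event indicator, t_i the observed time, and R(t_i) the set of patients still at risk at t_i. Ties are ignored, and software packages may scale the likelihood term differently.

```latex
\hat{\beta}(\lambda)
  = \arg\max_{\beta}
    \left[ \ell(\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right],
\qquad
\ell(\beta)
  = \sum_{i:\, \delta_i = 1}
    \left[ x_i^{\top}\beta
           - \log \sum_{k \in R(t_i)} \exp\!\left( x_k^{\top}\beta \right) \right]
```

Larger values of λ drive more coefficients exactly to zero, which is why the covariates with non-zero coefficients at the cross-validated λ form the selected set.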
After categorization, all covariates were included in a multivariable Cox regression model, from which point values were derived based on the magnitude and precision of the hazard ratios (HRs). To transform the final multivariable Cox regression model into an easily usable clinical score similar to CHA_2_DS_2_-VASc, covariates whose 95% CI lay entirely above 1.00 received two points; those whose CI crossed 1.00 received one. The maximum possible score is 10 (Table 1). Cubic spline plots were generated to assess nonlinear relationships between total score and predicted event rates. Inflection points where predicted hazard ratios changed slope informed the thresholds for score-based risk categories: 0 = none, 1–4 = low/intermediate, and >4 = high. Score values with an HR and 95% CI entirely above 1.00 were classified as high risk, those whose 95% CI crossed 1.00 as low/intermediate risk, and those with an HR and 95% CI entirely below 1.00 as no risk. The no-risk label reflects that the observed IS/TIA risk among patients in our cohort with a score of 0 was lower than the expected population risk [15].
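The point-assignment rule can be written as a small function. This is a sketch only: the function name and the example intervals are hypothetical, and the zero-point branch for an interval entirely below 1.00 is an assumption, since the text does not describe that case for individual covariates.

```python
def points_from_ci(ci_lower, ci_upper):
    """Map a covariate's 95% CI for the hazard ratio to score points."""
    if ci_lower <= 0 or ci_lower > ci_upper:
        raise ValueError("expected 0 < ci_lower <= ci_upper")
    if ci_lower > 1.0:
        return 2  # whole interval lies above 1.00
    if ci_upper >= 1.0:
        return 1  # interval crosses (or touches) 1.00
    return 0      # assumption: interval entirely below 1.00 earns no points


# Hypothetical hazard-ratio intervals, for illustration only (not study results).
example_cis = {
    "covariate_a": (1.20, 2.90),
    "covariate_b": (0.85, 1.70),
}
example_points = {
    name: points_from_ci(lower, upper)
    for name, (lower, upper) in example_cis.items()
}
```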
2.5. External Validation
The external validation of the score utilized retrospective data from the Medical College of Georgia (MCG) Cancer Center, an academic tertiary care center that serves much of the urban, suburban, and rural areas in the region. Using manually curated BC records (2010–2022), we identified patients who developed AF post-cancer. Diagnoses and covariates matched the UH variables when available (Supplemental Table S2).
2.6. CHA2DS2-VASc Calculation
We calculated CHA_2_DS_2_-VASc (maximum = 9) per consensus definitions (Supplemental Table S3) [31,32]. In this population that solely consisted of females, a score of ≤2 was categorized as low/medium risk, while a score of ≥3 was categorized as high risk, based on guidelines that give an Ia class recommendation to initiate anticoagulation in those with a CHA_2_DS_2_-VASc score of ≥2 in men and ≥3 in women [33].
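The comparator score can be sketched with the standard CHA2DS2-VASc point values and the dichotomy described above. The function and argument names are illustrative assumptions, and the exact variable definitions in Supplemental Table S3 are not reproduced here.

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 prior_stroke_tia_te, vascular_disease):
    """Standard CHA2DS2-VASc total (range 0 to 9)."""
    score = 0
    score += 1 if chf else 0                  # C: congestive heart failure
    score += 1 if hypertension else 0         # H: hypertension
    if age >= 75:                             # A2: age 75 or older
        score += 2
    elif age >= 65:                           # A: age 65 to 74
        score += 1
    score += 1 if diabetes else 0             # D: diabetes mellitus
    score += 2 if prior_stroke_tia_te else 0  # S2: prior stroke/TIA/thromboembolism
    score += 1 if vascular_disease else 0     # V: vascular disease
    score += 1 if female else 0               # Sc: sex category (female)
    return score


def cha2ds2_vasc_category_female(score):
    """Dichotomy used for this all-female cohort: <=2 low/medium, >=3 high."""
    return "high" if score >= 3 else "low/medium"


# Hypothetical patient: 74-year-old woman with hypertension and diabetes.
example_score = cha2ds2_vasc(age=74, female=True, chf=False, hypertension=True,
                             diabetes=True, prior_stroke_tia_te=False,
                             vascular_disease=False)
example_category = cha2ds2_vasc_category_female(example_score)
```

Because the sex-category point is always present in an all-female cohort, a single additional risk factor plus age 65 to 74 already reaches the high-risk threshold.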
2.7. Performance
For both scores, we computed concordance indices (C-index) with 95% CI [34], confusion matrices, and categorical/continuous net reclassification improvement (NRI) [35]. The categorical NRI was computed using the pre-existing risk categories of low/moderate versus high for CHA_2_DS_2_-VASc and none, low/intermediate, and high for the novel score. Because BC patients may be at increased risk of bleeding in addition to their increased thromboembolic risk, the intermediate risk category allows for permissive cardiotoxicity [36] by not anticoagulating despite the thromboembolic risk, which could be useful in cancers that increase bleeding risk. The continuous NRI was computed using raw scores. A positive NRI indicated improved predictability of the novel score versus CHA_2_DS_2_-VASc [35]. A Kaplan–Meier curve stratified by risk category assessed whether the model appropriately discriminated between patients at different levels of predicted risk. In a well-calibrated model, the low-risk group would exhibit higher IS/TIA-free survival than the medium- and high-risk groups over time. This is seen in our curve, which separates patients by outcome risk, even if no absolute risk values are reported (Supplemental Figure S2). Reporting of this prediction model study adheres to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines [37] (Supplemental Figure S3).
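The two performance metrics can be illustrated with a minimal, dependency-free sketch. It is not the authors' R code: Harrell's pairwise definition of the C-index and the binary-outcome form of the categorical NRI are assumptions, confidence intervals are omitted, and the NRI function expects both scores' categories already coded on a common ordinal scale (how the two-tier and three-tier schemes were aligned is not stated in the text).

```python
def harrell_c_index(times, events, scores):
    """Harrell's C: share of comparable pairs ranked correctly by the score.

    A pair (i, j) is comparable when patient i had the event and did so
    before patient j's event or censoring time; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def categorical_nri(old_categories, new_categories, events):
    """NRI = [P(up|event) - P(down|event)] + [P(down|no event) - P(up|no event)]."""
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, event in zip(old_categories, new_categories, events):
        if event:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Toy data for illustration only (not study data).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 0]
toy_c = harrell_c_index(toy_times, toy_events, scores=[3, 2, 1, 0])
toy_nri = categorical_nri([0, 0, 1, 1], [1, 0, 0, 1], toy_events)
```

In the toy data, the higher-scored patients have the earlier events, so every comparable pair is concordant; one event moves up a category and one non-event moves down, so both NRI components are positive.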
2.8. Descriptive Analysis
The data were presented as absolute values and percentages for categorical variables and as mean with standard deviation (SD) or median with interquartile range (IQR) for continuous variables, depending on the data distribution. Categorical variables were compared using Pearson’s χ^2^ test. Data distribution was checked using histograms and the Kolmogorov–Smirnov test, followed by independent samples t-tests for normally distributed variables and nonparametric Kruskal–Wallis tests for non-normally distributed variables. A p-value of <0.05 was considered significant.
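For a 2×2 comparison, Pearson's χ^2^ statistic and its p-value (one degree of freedom) can be computed directly, as in the sketch below. The counts are hypothetical, and no continuity correction is applied; whether the authors used one is not stated (R's `chisq.test` applies Yates' correction to 2×2 tables by default).

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square without continuity correction for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = outcome groups, columns = covariate present/absent.
chi2, p_value = pearson_chi2_2x2(30, 10, 20, 40)
```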
Software and Packages
All statistical analyses were performed using R (v4.4.2) [38]. The BiocManager [39], boot [40], prodlim [41], survcomp [42], and survival [43] packages were used.
3. Results
3.1. Population
Out of the 10,700 total patients in the UH database, 935 were diagnosed with AF after BC diagnosis, with a median follow-up of 2.37 years (IQR: 0.64–5.17) after BC diagnosis. The median age at cancer diagnosis was 74 (IQR: 66–80). In terms of cancer treatment, 38.0% of patients underwent surgery, 38.0% received endocrine therapy, 24.0% received chemotherapy, 4.1% received immunotherapy, and 28.2% received radiation therapy (Table 2). In this UH cohort, 87 patients had an IS/TIA after AF diagnosis. The significant differences between the 848 patients who did not have an IS/TIA after AF diagnosis and the 87 patients who did were dyslipidemia (without IS/TIA: 74.5%, with IS/TIA: 86.2%, p = 0.02), statin use (without IS/TIA: 53.7%, with IS/TIA: 75.9%, p < 0.001), antihypertensive use (without IS/TIA: 83.6%, with IS/TIA: 93.1%, p = 0.02), aspirin use (without IS/TIA: 57.0%, with IS/TIA: 79.3%, p < 0.001), prior stroke/TIA/embolism (without IS/TIA: 13.7%, with IS/TIA: 37.9%, p < 0.001), and prior vascular disease (without IS/TIA: 39.9%, with IS/TIA: 64.4%, p < 0.001; Table 2).
Out of the 2274 total patients in the MCG database, 95 were diagnosed with AF after BC diagnosis, with a median follow-up of 5.85 years (IQR: 2.74–8.23) after BC diagnosis. The median age at cancer diagnosis was 70 (IQR: 64–74.5). In terms of cancer treatment, 95.8% of patients underwent surgery, 70.3% received endocrine therapy, 25.3% received chemotherapy, 7.4% received immunotherapy, and 61.1% received radiation therapy (Table 2). Eight patients from the external validation cohort had an IS/TIA after AF diagnosis. No significant differences in baseline demographics or cardiovascular (CV) events after BC diagnosis were seen between the 87 patients who did not have an IS/TIA after AF diagnosis and the eight patients who did (Table 2).
3.2. Novel Score
The covariates selected from the internal validation cohort by LASSO regression were body mass index, smoking history (current or previous), CKD, antihypertensive use, lipid-lowering therapy (statin use), ethnicity (Black race), diabetes, and stroke/TIA/embolism (prior) (Supplemental Table S4). The acronym B-S_2_CALED is used to capture the individual components of the score. Among these, BMI was categorized as binary using a cutoff of 30. After BMI categorization and based on the updated LASSO model, positive smoking history and prior stroke/TIA/embolism were assigned two points each, while all other components were assigned one point each, resulting in a maximum possible score of 10 (Table 1). Risk categories were defined as 0 = no risk, 1–4 = low/intermediate risk, and >4 = high risk (Supplemental Table S5).
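The point values and risk tiers described above translate into a short calculator. This is an illustrative sketch rather than the authors' implementation: the field names are hypothetical, and treating the BMI cutoff as inclusive (BMI ≥ 30) is an assumption, since the text gives only "a cutoff of 30".

```python
from dataclasses import dataclass


@dataclass
class Patient:
    bmi: float
    smoking_history: bool            # current or previous
    ckd: bool
    antihypertensive_use: bool
    statin_use: bool                 # lipid-lowering therapy
    black_race: bool
    diabetes: bool
    prior_stroke_tia_embolism: bool


def bs2caled_score(p):
    """Sum the B-S2CALED points (range 0 to 10)."""
    score = 0
    score += 1 if p.bmi >= 30 else 0  # assumption: cutoff is inclusive
    score += 2 if p.smoking_history else 0
    score += 1 if p.ckd else 0
    score += 1 if p.antihypertensive_use else 0
    score += 1 if p.statin_use else 0
    score += 1 if p.black_race else 0
    score += 1 if p.diabetes else 0
    score += 2 if p.prior_stroke_tia_embolism else 0
    return score


def bs2caled_category(score):
    """Risk tiers: 0 = none, 1-4 = low/intermediate, >4 = high."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score == 0:
        return "none"
    if score <= 4:
        return "low/intermediate"
    return "high"


# Hypothetical patient, for illustration only.
example = Patient(bmi=32.0, smoking_history=True, ckd=False,
                  antihypertensive_use=True, statin_use=True,
                  black_race=False, diabetes=False,
                  prior_stroke_tia_embolism=False)
example_total = bs2caled_score(example)
example_tier = bs2caled_category(example_total)
```

The hypothetical patient scores one point each for BMI, antihypertensive use, and statin use plus two for smoking history, which places her at the upper boundary of the high-risk tier's threshold.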
3.3. Comparison of the B-S2CALED Score with CHA2DS2-VASc
In the internal validation cohort, the B-S_2_CALED score performed better than CHA_2_DS_2_-VASc in a categorical model (B-S_2_CALED C-index = 0.64 [95% CI: 0.59, 0.70] versus CHA_2_DS_2_-VASc C-index = 0.54 [95% CI: 0.51, 0.56]) and a continuous model (B-S_2_CALED C-index = 0.68 [95% CI: 0.62, 0.73] versus CHA_2_DS_2_-VASc C-index = 0.64 [95% CI: 0.58, 0.70]). Compared to CHA_2_DS_2_-VASc, B-S_2_CALED demonstrated an NRI of 0.188 for the categorical score and 0.150 for the continuous score (Table 3).
In the external validation cohort, the B-S_2_CALED score performed better than CHA_2_DS_2_-VASc in a categorical model (B-S_2_CALED C-index = 0.77 [95% CI: 0.72, 0.83] versus CHA_2_DS_2_-VASc C-index = 0.53 [95% CI: 0.51, 0.56]) and a continuous model (B-S_2_CALED C-index = 0.86 [95% CI: 0.79, 0.94] versus CHA_2_DS_2_-VASc C-index = 0.70 [95% CI: 0.51, 0.89]). Compared to CHA_2_DS_2_-VASc, B-S_2_CALED demonstrated an NRI of 0.563 for the categorical score and 0.695 for the continuous score (Table 3).
4. Discussion
This study presents the first externally validated, breast cancer–specific algorithm (B-S_2_CALED) for estimating ischemic stroke and transient-ischemic-attack risk in patients who develop atrial fibrillation after their cancer diagnosis. Across an internal cohort of 935 patients and an external cohort of 95 patients, B-S_2_CALED demonstrated materially higher discrimination and superior net reclassification relative to CHA_2_DS_2_-VASc, the current clinical standard. Additionally, the three tiers of the B-S_2_CALED score stratify thromboembolic risk by adding a middle risk tier, which allows for shared decision-making in high-risk cases. This is in contrast to CHA_2_DS_2_-VASc, under which anticoagulation may be considered at a score as low as one, so that almost every cancer patient would need anticoagulation. These findings address a critical gap highlighted by recent American Heart Association and European Society of Cardiology guidance, both of which acknowledge inadequate performance of conventional AF risk tools in oncology populations, yet offer no cancer-tailored alternative [13,44].
Both breast cancer and its prevalent therapies (anthracyclines, thoracic radiation, and endocrine manipulation) foster pro-inflammatory, pro-thrombotic, and arrhythmogenic states that may potentiate AF and cerebral embolism [14,15,16,17,18,19]. Traditional scores, designed decades before routine cardio-oncology surveillance, omit these cancer-specific influences as well as key social-determinant and comorbidity profiles that differ markedly between oncology and non-oncology cohorts [7,11,44]. Once chronic kidney disease, obesity, smoking history, and Black race (variables repeatedly associated with heightened stroke incidence and severity [45,46,47,48]) were incorporated, age did not retain independent prognostic weight in our LASSO-derived model. This observation supports the concept that biological aging, accelerated by malignancy and its treatment, may supersede chronological age as a determinant of vascular events in cancer survivors [49,50,51,52]. Future work should explore objective geroscience metrics (e.g., epigenetic clocks, telomere attrition) as candidate variables in cardio-oncology risk stratification.
Prior efforts to refine thromboembolic prediction in cancer patients have either appended a single “cancer” modifier to CHA_2_DS_2_-VASc [53] or tested venous-thromboembolism scores (e.g., Khorana) for arterial events, with limited success [54,55]. The UK Biobank analysis reported modest CHA_2_DS_2_-VASc performance in breast cancer (C-index 0.62) [12]. Our score achieved C-indices of 0.68 (internal) and 0.86 (external), plus positive NRI values in both cohorts, thereby yielding clinically meaningful improvement while retaining simplicity akin to CHA_2_DS_2_-VASc (B-S_2_CALED: max 10 points, three risk tiers).
Suboptimal anticoagulation remains common in cancer patients with AF because clinicians justifiably fear bleeding in a population already exposed to thrombocytopenia, drug–drug interactions, and procedural interventions [56]. A more accurate, oncology-specific risk tool can facilitate nuanced shared decision-making: patients classified as “high risk” by B-S_2_CALED may derive net benefit from anticoagulation despite bleeding concerns, whereas those deemed “low/intermediate” might safely defer therapy, particularly when receiving cardiotoxic regimens or undergoing invasive procedures. Moreover, the score relies on universally available clinical data, avoiding barriers that hamper the implementation of more complex calculators [57], thereby increasing the utility of the score in a clinical setting. Prospective, multicenter validation with adjudicated endpoints and concurrent bleeding assessment is therefore essential before broad adoption.
5. Strengths and Limitations
Strengths include the following: (i) a transparent, data-driven variable-selection process; (ii) sizeable internal training and validation subsets; (iii) external validation in an independent health-system cohort with greater Black representation than most cardiology or oncology trials [58,59,60]; and (iv) retention of an intuitive, point-based format conducive to bedside use. Limitations stem from the retrospective design; potential misclassification through ICD coding; incomplete treatment documentation in the electronic medical record, especially of anticoagulant exposure, which could impact IS/TIA rates; inclusion of ductal carcinoma in situ; identification of TIA via ICD coding, which lacks the objectivity of neuroimaging-confirmed stroke and may introduce misclassification bias (mitigated by manual checking of charts to ensure clinical correlation with ICD-10 coding); and modest event counts, particularly in the external cohort, that could inflate performance estimates. Additionally, higher medication use in the IS/TIA group may reflect reverse causation, i.e., therapy initiated post-event. This could affect the interpretability of statin and antihypertensive associations, as the timing of medication exposure was not consistently captured.
6. Conclusions
The B-S_2_CALED score offers a pragmatic, biologically plausible advance over the CHA_2_DS_2_-VASc score for predicting stroke risk in breast cancer patients who develop atrial fibrillation. While confirmation in larger prospective cohorts is required, our findings lay the groundwork for cancer-specific risk stratification paradigms that reconcile thromboembolic prevention with bleeding hazards in an expanding cardio-oncology population.
Clinical Perspectives
Competency in Medical Knowledge: This study emphasizes the importance of deriving tumor-specific scores for high-incidence cancers with known thromboembolic proclivity (lung, colorectal, prostate).
Translational Outlook: Expanding B-S_2_CALED validation to larger, geographically diverse registries will test generalizability and permit recalibration. Integration of circulating biomarkers (e.g., D-dimer, high-sensitivity CRP) and quantitative imaging indices of cardio-oncologic injury may further enhance accuracy without sacrificing usability. Finally, clinical-decision support embedding B-S_2_CALED into electronic health records could standardize stroke-prevention strategies across oncology practices.
References
- 1. Linz D., Gawalko M., Betz K., Hendriks J.M., Lip G.Y.H., Vinter N., Guo Y., Johnsen S. Atrial Fibrillation: Epidemiology, Screening and Digital Health. Lancet Reg. Health Eur. 2024, 37, 100786. doi:10.1016/j.lanepe.2023.100786. PMID 38362546. PMC10866942.
- 2. Yao X., Hu Q., Liu X., Ling Q., Leng Y., Zhao H., Yu P., Ma J., Zhao Y., Liu M. Atrial Fibrillation and Breast Cancer-Vicious Twins? A Systematic Review and Meta-Analysis. Front. Cardiovasc. Med. 2023, 10, 1113231. doi:10.3389/fcvm.2023.1113231. PMID 36970342. PMC10036368.
- 3. Jones N.R., Taylor C.J., Hobbs F.D.R., Bowman L., Casadei B. Screening for Atrial Fibrillation: A Call for Evidence. Eur. Heart J. 2020, 41, 1075–1085. doi:10.1093/eurheartj/ehz834. PMID 31811716. PMC7060457.
- 4. Yun J.P., Choi E.-K., Han K.-D., Park S.H., Jung J.-H., Park S.H., Ahn H.-J., Lim J.-H., Lee S.-R., Oh S. Risk of Atrial Fibrillation According to Cancer Type. JACC Cardio Oncol. 2021, 3, 221–232. doi:10.1016/j.jaccao.2021.03.006. PMID 34396327. PMC8352078.
- 5. Navi B.B., Reiner A.S., Kamel H., Iadecola C., Okin P.M., Elkind M.S.V., Panageas K.S., De Angelis L.M. Risk of Arterial Thromboembolism in Patients with Cancer. J. Am. Coll. Cardiol. 2017, 70, 926–938. doi:10.1016/j.jacc.2017.06.047. PMID 28818202. PMC5667567.
- 6. Navi B.B., Kasner S.E., Elkind M.S.V., Cushman M., Bang O.Y., De Angelis L.M. Cancer and Embolic Stroke of Undetermined Source. Stroke 2021, 52, 1121–1130. doi:10.1161/STROKEAHA.120.032002. PMID 33504187. PMC7902455.
- 7. Seth L., Stabellini N., Doss S., Patel V., Shah V., Lip G., Dent S., Fradley M.G., Køber L., Guha A. Atrial Fibrillation and Ischemic Stroke in Cancer: The Latest Scientific Evidence, Current Management, and Future Directions. J. Thromb. Thrombolysis 2025, 1–14. doi:10.1007/s11239-025-03104-3. PMID 40281267.
- 8. Guha A., Dey A.K., Jneid H., Ibarz J.P., Addison D., Fradley M. Atrial Fibrillation in the Era of Emerging Cancer Therapies. Eur. Heart J. 2019, 40, 3007–3010. doi:10.1093/eurheartj/ehz649. PMID 31541552. PMC6933869.
