Fuzzy Logic Approaches for Causal Inference in Health Care: Systematic Review
Jaime Jamett, Patricio Melendez, Ximena Collao-Ferrada, Karina Cordero-Torres, Alejandro Veloz

TL;DR
This paper reviews how fuzzy logic has been used in healthcare for causal inference, finding that while it's flexible and interpretable, its use for clear causal analysis is still limited and needs more rigorous methods.
Contribution
The paper systematically reviews fuzzy logic applications for causal inference in healthcare, highlighting gaps and suggesting integration with formal causal frameworks.
Findings
37 studies applied fuzzy logic in healthcare for causal questions, mostly using fuzzy inference systems and cognitive maps.
Only 2 studies explicitly used formal causal inference frameworks, with most relying on predictive or associative modeling.
Fuzzy approaches showed mixed performance compared to comparator models, with moderate to high risk of bias in most studies.
Abstract
Fuzzy logic has been progressively investigated as a viable alternative to traditional statistical and machine learning methods in health care modeling, especially in environments marked by uncertainty, nonlinearity, and missing information. Although its use in prediction, classification, and risk stratification is well established, its application to explicit causal inference remains limited, varied, and methodologically premature. This systematic review aimed to examine how fuzzy logic frameworks have been used to address causal questions in health care, focusing on their methodological characteristics, comparative performance, and degree of integration with formal causal inference approaches. A systematic search across 6 databases (PubMed, Web of Science, ScienceDirect, SpringerLink, Scopus, and IEEE Xplore) identified peer-reviewed studies published between 2014 and 2025 that…

Table 1. Search strategies used in each database.

| Database | Search strategy |
|---|---|
| PubMed | (“Fuzzy Logic” [MeSH] OR “Fuzzy logic” [Title/Abstract] OR “Fuzzy modelling” [Title/Abstract] OR “Fuzzy inference system*” [Title/Abstract]) |
| Web of Science, ScienceDirect, SpringerLink, IEEE Xplore, and Scopus | (“fuzzy logic” OR “fuzzy modelling” OR “fuzzy inference system*”) |

Table 2. Fuzzy logic–based methods identified across the 37 included studies.

| Fuzzy method | Abbreviation | Primary analytical role | Typical application in reviewed studies | Studies, n (%) |
|---|---|---|---|---|
| Fuzzy inference system (Mamdani-type and variants) | FIS | Rule-based modeling under uncertainty | Prediction, classification, decision support | 8 (21.62) |
| Fuzzy analytic hierarchy process | FAHP | Multicriteria decision analysis | Risk prioritization, decision support | 6 (16.22) |
| Fuzzy cognitive maps | FCM | Conceptual modeling of interacting variables | Simulation of influence structures, exploratory causal reasoning | 5 (13.51) |
| Adaptive neuro-fuzzy inference system | ANFIS | Hybrid learning and fuzzy inference | Prediction and pattern recognition | 3 (8.11) |
| Hybrid fuzzy models combined with MCDM | Hybrid Fuzzy + MCDM | Multicriteria decision support | Structural prioritization and ranking | 3 (8.11) |
| Fuzzy clustering (C-means/K-means) | - | Unsupervised pattern discovery | Grouping and exploratory analysis | 2 (5.41) |
| Fuzzy-trace theory | FTT | Cognitive decision modeling | Behavioral and decision-making analysis | 2 (5.41) |
| Fuzzy-set qualitative comparative analysis | fsQCA | Configurational causal analysis | Identification of necessary and sufficient conditions | 2 (5.41) |
| Takagi-Sugeno fuzzy models | TS/TSK | Rule-based functional approximation | Predictive modeling | 1 (2.70) |
| Fuzzy failure mode and effects analysis | F-FMEA | Risk and failure assessment | Safety and risk analysis | 1 (2.70) |
| Mediative fuzzy logic | MFL | Decision mediation modeling | Clinical decision support | 1 (2.70) |
| Fuzzy evidential reasoning | FER | Evidence aggregation | Decision support under uncertainty | 1 (2.70) |
| Likelihood-fuzzy analysis | LFA | Probabilistic-fuzzy integration | Risk estimation | 1 (2.70) |
| Profile-based fuzzy association rule mining | PB-FARM | Pattern and rule discovery | Association analysis | 1 (2.70) |

Table 3. Characteristics of the 37 included studies.

| Study (year) | Domain | Task/outcome | Fuzzy method | Dataset size | Data | Comparator | Primary metric | CI |
|---|---|---|---|---|---|---|---|---|
| Amirkhani et al (2014) | Other | Autoimmune hepatitis | NFCM+NFIS | M | Inst | Direct: ANFIS | AUC 89.8 | I |
| Lee et al (2015) | ID | HIV prevalence (policy) | fsQCA | L | Pub | None | Consistency 0.95 | E |
| Maranate et al (2015) | Other | OSA severity | FAHP | L | Inst | None | Sens 92.3 | P |
| Subramanian et al (2015) | Cancer | Breast cancer risk | L2-FCM | S | Synth | Direct: FCM | AUC 94.3 | P |
| Wolfe et al (2015) | Cancer | Risk decision-making | FTT | M | Pub | Direct: RCT control | NR | P |
| Mollalo and Khodabandehloo (2016) | ID | Leishmaniasis risk map | FAHP+GIS | L | Inst | Base | AUC 90.5 | P |
| Yılmaz et al (2016) | Cancer | Lung cancer | ANFIS-MEP | L | Inst | Direct: ANFIS, EP | AUC 94.6 | P |
| Pota et al (2017) | Cancer | Radiotherapy side effects | LFA | S | Inst | Direct: NB | AUC 0.81 | P |
| Stanković and Stanković (2017) | Cancer | Prostate survival | Neuro-fuzzy | S | Inst | Direct: ANN, FIS | | P |
| Iancu (2018) | CVD | CVD diagnosis | MFL | NA | Synth | None | NR | P |
| Sabahi (2018) | CVD | CHD risk ranking | BFAHP | NA | Exp | Direct: AHP | AUC 0.86 | P |
| Saleh et al (2018) | CVD | Diabetic retinopathy | ANFIS | M | Inst | Direct: RF, MLP, kNN | AUC 0.84 | P |
| Argyropoulos et al (2019) | Other | AKI stage-3 risk | TSK | L | Inst | Direct: LR | AUC 0.95 | P |
| Romero et al (2019) | ID | Dengue risk | FIS-Mamdani | NA | Pub | None | AUC >0.86 | P |
| Sarkar et al (2019) | ID | Malaria ecological risk | FIS+AHP | L | Mixed | Base | NR | P |
| Souza et al (2019) | PTB | PTB phenotypes | Fuzzy clustering | L | Inst | None | NR | P |
| Boni et al (2020) | CVD | CVD in dialysis | FIS-Mamdani | M | Inst | None | AUC 0.92 | P |
| Hynek et al (2020) | Mental | Refugee mental health | FCM | S | Exp | None | NR | I |
| Mahmoodi et al (2020) | Cancer | Gastric cancer | FCM-NHL | M | Inst | Direct: ANN, SVM, DT, NB | AUC 95.8 | I |
| Piyatilake and Perera (2020) | ID | Dengue clusters | FAHP | L | Pub | Base | AUC 0.73 | P |
| Malakoutikhah et al (2021) | OHS | MSD risk (steel) | FIS-Mamdani | M | Mixed | None | | P |
| Shi et al (2021) | ID | Outbreak risk | FER | S | Exp | None | α=0.79 | P |
| Yavari et al (2021) | CVD | Heart disease profiling | PB-FARM | M | Pub | Direct | Conf 0.73 | P |
| Mohandes et al (2022) | OHS | Construction safety | IVIF-DEMATEL+ANP | S | Pub | Direct | α=0.74 | E |
| Safaei et al (2022) | CVD | Obesity model | MFRBS+DEMATEL | L | Pub | None | NR | I |
| Barbounaki and Sarantaki (2022) | PTB | PTB risk assessment | FAHP | M | Inst | Base | NR | P |
| Brust-Renck and Reyna (2023) | Cancer | Cancer risk decisions | FTT | L | Inst | Base | NR | P |
| Aydın and Özkan (2024) | CVD | LMIC cardiovascular risk profiling | IVPF-AHP+TOPSIS | L | Inst | None | NR | P |
| Benito et al (2024) | ID | COVID/dengue | FCM+LAMDA | L | Pub | Direct: RF, LAMDA | AUC 0.89 | I |
| Chen et al (2024) | Mental | Child depression | fsQCA+OLS | M | Mixed | None | Consistency 0.867 | I |
| Costa et al (2024) | ID | Leishmaniasis risk | FIS-Mamdani | L | Pub | None | NR | P |
| Sakinala et al (2024) | OHS | Mining MSD risk | FIS-Mamdani | S | Inst | Base | | P |
| Sümbül-Şekerci et al (2024) | Other | T2DM cognition | FCM+CRT | M | Inst | Direct: CRT | AUC 0.91 | P |
| Upadhyay et al (2024) | OHS | Iron ore MSD risk | FIS-Mamdani | S | Inst | None | NR | P |
| Demir and Sabır (2025) | OHS | Workplace risk | F-FMEA | S | Exp | Direct: FMEA | NR | P |
| Rani and Dhanasekar (2025) | ID | Zika risk factors | Type-2 FS+MCDM | NA | Exp | None | | P |
| Scrobota et al (2025) | ID | Periodontitis (T2DM) | FIS-Mamdani | S | Inst | None | | P |

Comparative outcomes reported by the 14 studies that used direct comparators.

| Study (year) | Domain | Fuzzy method | Comparators | Reported metrics | Comparative outcome |
|---|---|---|---|---|---|
| Amirkhani et al (2014) | Other (AIH) | NFCM | NFIS, ANFIS | Acc | Neuro-fuzzy cognitive map improved explainability; performance comparable |
| Subramanian et al (2015) | Cancer (breast) | L2-FCM | Standard FCM | Acc 94.3 vs 92.6 | Layered FCM improved accuracy and interpretability |
| Yılmaz et al (2016) | Cancer (lung) | ANFIS-MEP | ANFIS | Acc 94.6 vs 92.6; RMSE | Neuro-fuzzy model achieved higher accuracy and faster convergence |
| Pota et al (2017) | Cancer (RT) | LFA | Naïve Bayes | Acc 0.81 vs 0.84; mixed sensitivity/specificity | Comparable accuracy; fuzzy model offered rule-based interpretability |
| Stanković and Stanković (2017) | Cancer (prostate) | Neuro-fuzzy | ANN | | Neuro-fuzzy slightly outperformed ANN and standard FIS |
| Saleh et al (2018) | CVD | ANFIS | RF | Acc 84.2 vs 77.3 (DRSA) | ANFIS achieved the best accuracy among the tested classifiers |
| Sabahi (2018) | CVD (CHD) | BFAHP | AHP | Acc 85.9 vs 77.3 | Fuzzy AHP showed greater robustness under uncertainty |
| Argyropoulos et al (2019) | AKI | TSK | Logistic regression | AUC | Equivalent AUC; fuzzy gained sensitivity in some models |
| Mahmoodi et al (2020) | Cancer (gastric) | FCM-NHL | ANN, SVM | Acc 95.8 vs 90.5 (ANN) | FCM-NHL achieved the highest predictive accuracy across methods |
| Yavari et al (2021) | CVD (heart disease) | PB-FARM | Association rule/classification methods | Support/confidence | Fuzzy association mining extracted higher-confidence clinical rules |
| Mohandes et al (2022) | OHS | IVIF-DEMATEL | IVIF-ANP | Reliability α=0.74 | Hybrid fuzzy method prioritized causal factors with higher consistency |
| Benito et al (2024) | ID | FCM+LAMDA | RF | AUC 0.89 vs 0.98 (RF) | RF outperformed in accuracy; fuzzy models offered stronger explainability |
| Sümbül-Şekerci et al (2024) | Other (T2DM) | FCM+CRT | CRT | AUC 0.91 (cluster 1) | Fuzzy clustering identified cognitive subgroups; CRT supported classification |
| Demir and Sabır (2025) | OHS | F-FMEA | Classical FMEA | NR | Fuzzy FMEA reduced subjectivity in risk prioritization |

Introduction
Health care is strongly influenced by uncertainty. Clinical and public health decisions are routinely made under conditions of incomplete, ambiguous, or imprecise information, arising not only from individual patient variability but also from the complexity of health care systems and the processes through which real-world data are generated [1]. Such uncertainty encompasses both stochastic variability and epistemic constraints and is further amplified by heterogeneous, noisy, and nonlinear data from electronic health records, diagnostic imaging, physiological signals, and population-level monitoring systems [2]. Under these conditions, conventional statistical approaches—typically relying on fixed thresholds, linearity, and prespecified functional forms—often struggle to represent gradual transitions, ambiguous diagnostic boundaries, and context-dependent relationships that characterize real-world clinical data [1-3].
Causal inference emerged as a methodological approach for estimating the effects of exposures or interventions on health outcomes, explicitly addressing the limitations of purely associational analyses [4-10]. Rather than focusing on prediction, this approach centers on counterfactual questions (what would be expected to occur under hypothetical alterations in treatment or exposure) by making causal assumptions explicit and, in principle, empirically assessable [9,11-13]. This perspective is particularly relevant in health care and public health, where randomized controlled trials are frequently impractical and observational data constitute the primary source of evidence [9,11,13,14]. In such settings, causal reasoning is commonly formalized using directed acyclic graphs, which encode assumptions about causal structure, confounding, and intervention pathways, thereby enabling principled identification of causal effects [15-17].
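To make the contrast between associational and causal estimates concrete, the sketch below works through a toy example with a single binary confounder Z. All counts are hypothetical and chosen only for illustration, and the adjustment is valid only under the stated assumption that Z is the sole confounder (the one backdoor path in the assumed graph). It compares a crude risk difference with one obtained by standardizing over Z.

```python
# Hypothetical counts, not taken from any reviewed study:
# (confounder Z, treatment A) -> (number of people, number of events)
counts = {
    (0, 1): (100, 20),
    (0, 0): (400, 40),
    (1, 1): (400, 240),
    (1, 0): (100, 50),
}


def crude_risk(a):
    """Risk among everyone with treatment a, ignoring the confounder."""
    n = sum(counts[(z, a)][0] for z in (0, 1))
    events = sum(counts[(z, a)][1] for z in (0, 1))
    return events / n


def standardized_risk(a):
    """Stratum-specific risks under treatment a, weighted by P(Z=z) in the
    whole population. This is standardization over Z and assumes that Z is
    the only confounder of the treatment-outcome relationship."""
    total = sum(n for n, _ in counts.values())
    risk = 0.0
    for z in (0, 1):
        n_z = counts[(z, 0)][0] + counts[(z, 1)][0]  # stratum size
        n, events = counts[(z, a)]
        risk += (n_z / total) * (events / n)
    return risk


crude_rd = crude_risk(1) - crude_risk(0)
adjusted_rd = standardized_risk(1) - standardized_risk(0)
print(f"crude risk difference:        {crude_rd:.2f}")
print(f"standardized risk difference: {adjusted_rd:.2f}")
```

In this constructed example the treated group is concentrated in the higher-risk stratum, so the crude contrast overstates the effect relative to the standardized one.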
Recent developments have emphasized the central role of explicit study design in strengthening causal inference from observational data. Target trial emulation (TTE) clarifies the causal question by prespecifying the key protocol components of a hypothetical randomized trial—including eligibility criteria, treatment strategies, time zero, follow-up, and outcomes—prior to analysis, thereby aligning observational studies with the core principles of randomized experiments [17-20]. While this design-oriented framework can reduce avoidable biases and enhance interpretability, it does not by itself ensure valid effect estimation. In practice, TTE still requires appropriate identification assumptions and estimation strategies and may remain vulnerable to challenges such as model misspecification or limited flexibility when representing complex data structures [17].
Despite advances in causal inference and design-oriented approaches, substantial challenges persist at the estimation stage when analyzing complex health care data. Even when causal questions are explicitly defined, commonly used estimation methods often rely on inflexible functional assumptions, sharply delineated variables, and correctly specified models, conditions that are difficult to sustain in high-dimensional and heterogeneous clinical environments [3,8,11]. As a result, a methodological gap remains between rigorously specified causal designs and the flexible representation of nonlinearity, gradual clinical thresholds, and uncertainty inherent in observational health data, limiting the applicability of traditional causal models in complex real-world settings [1,2,11,14,21].
As a response to the demand for flexible representations of uncertainty and nonlinearity, fuzzy logic has been adopted as a modeling paradigm in health care research, grounded in earlier theoretical developments on vagueness and graduality. Central to this evolution was Zadeh’s introduction of fuzzy sets as a generalization of classical set theory, in which membership is defined by degrees rather than binary inclusion [22-24]. This formulation provided a formal mathematical basis for representing ambiguity, partial truth, and gradual transitions in complex systems, thereby enabling the representation of phenomena that cannot be adequately captured using crisp categories. Building on this foundation, subsequent developments extended fuzzy sets into operational fuzzy logic systems, particularly through rule-based inference mechanisms that support reasoning with linguistic variables and imprecise conditions [25].
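A minimal sketch may help illustrate graded membership and rule-based inference. It assumes a hypothetical input (body temperature in °C) and a hypothetical 0-100 risk score; the breakpoints, the two rules, and all function names are illustrative assumptions, not values drawn from the reviewed studies. The inference step follows the usual Mamdani pattern (min for rule firing, max for aggregation, centroid defuzzification) on a discretized output range.

```python
def ramp_up(x, a, b):
    """Membership that is 0 below a, 1 above b, and linear in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def ramp_down(x, a, b):
    """Mirror image of ramp_up: 1 below a, 0 above b."""
    return 1.0 - ramp_up(x, a, b)


# Crisp set: a temperature either is "high" or is not (hypothetical cutoff).
def crisp_high(t):
    return t >= 38.0


# Fuzzy sets: membership in "normal" and "high" changes gradually
# between 37.0 and 38.0 (hypothetical breakpoints).
def temp_normal(t):
    return ramp_down(t, 37.0, 38.0)


def temp_high(t):
    return ramp_up(t, 37.0, 38.0)


# Output fuzzy sets on a hypothetical 0-100 risk scale.
def risk_low(r):
    return ramp_down(r, 20.0, 60.0)


def risk_high(r):
    return ramp_up(r, 40.0, 80.0)


def mamdani_risk(t, steps=1000):
    """Two-rule Mamdani-style inference with centroid defuzzification.

    Rule 1: IF temperature is normal THEN risk is low.
    Rule 2: IF temperature is high   THEN risk is high.
    """
    w_low = temp_normal(t)
    w_high = temp_high(t)
    num = den = 0.0
    for i in range(steps + 1):
        r = 100.0 * i / steps
        # Clip each consequent at its rule strength (min), then aggregate (max).
        mu = max(min(w_low, risk_low(r)), min(w_high, risk_high(r)))
        num += r * mu
        den += mu
    return num / den if den else 0.0


for t in (36.5, 37.5, 37.9, 38.5):
    print(t, crisp_high(t), round(temp_high(t), 2), round(mamdani_risk(t), 1))
```

A reading of 37.9 °C falls outside the crisp set but has a high degree of membership in the fuzzy set, and the defuzzified score rises smoothly across the transition instead of jumping at the cutoff.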
In clinical contexts, this representational flexibility facilitates the translation of gradual and linguistically defined clinical concepts into implementable computational models. Building on these foundations, a range of fuzzy logic–based approaches, including fuzzy inference systems (FIS), adaptive neuro-fuzzy inference systems (ANFIS), fuzzy cognitive maps (FCM), Takagi-Sugeno models, and fuzzy clustering, have been applied across diverse health care domains. These applications span infectious diseases [26-34], cardiology [35-42], oncology [43-49], obstetrics [50-52], mental health [53,54], and occupational health and safety [55-59].
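As an illustration of one of these approaches, the sketch below simulates a small fuzzy cognitive map. The concept names, the signed weights, and the update rule (a sigmoid applied to each concept's previous activation plus its weighted inputs, one of several variants in use) are assumptions made for illustration only. The weights encode presumed influences, so comparing scenarios explores the assumed structure and does not by itself identify causal effects.

```python
import math

# Hypothetical concepts and weights, for illustration only.
concepts = ["stress", "sleep_quality", "symptom_severity"]

# W[j][i] is the signed influence of concept j on concept i, in [-1, 1].
W = [
    [0.0, -0.6, 0.5],   # stress lowers sleep quality, raises symptoms
    [0.0, 0.0, -0.4],   # sleep quality lowers symptoms
    [0.0, 0.0, 0.0],    # symptoms influence nothing in this toy map
]


def sigmoid(x, lam=1.0):
    """Squashing function that keeps activations in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def step(state):
    """One synchronous update of all concept activations."""
    n = len(state)
    return [
        sigmoid(state[i] + sum(W[j][i] * state[j] for j in range(n) if j != i))
        for i in range(n)
    ]


def run(state, clamp=None, max_iter=200, tol=1e-6):
    """Iterate until activations stabilize; `clamp` pins chosen concepts."""
    clamp = clamp or {}
    state = list(state)
    for i, v in clamp.items():
        state[i] = v
    for _ in range(max_iter):
        new = step(state)
        for i, v in clamp.items():
            new[i] = v
        if max(abs(a - b) for a, b in zip(new, state)) < tol:
            return new
        state = new
    return state


start = [0.5, 0.5, 0.5]
high_stress = run(start, clamp={0: 1.0})
low_stress = run(start, clamp={0: 0.0})
for name, hi, lo in zip(concepts, high_stress, low_stress):
    print(f"{name:17s} high-stress={hi:.3f} low-stress={lo:.3f}")
```

Clamping a concept and comparing the resulting steady states is the kind of what-if simulation for which such maps are typically used.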
More recently, fuzzy logic–based approaches have increasingly been combined with machine learning and artificial intelligence techniques to enhance predictive performance, scalability, and automation in health care applications [60,61]. Despite this growing convergence, the extent to which such hybrid models explicitly engage with causal reasoning (through the definition of counterfactual estimands, formal identification strategies, and transparent causal assumptions) remains inconsistently reported in the literature. Against this background, this systematic review aimed to evaluate and synthesize evidence on the application of fuzzy logic–based approaches for causal inference in health care.
Methods
Design and Reporting Standards
The review was conducted following established systematic review standards, with adaptations to accommodate computational health modeling studies [62]. Eligibility criteria and search strategy used a modified PICO framework, targeting complex datasets (Population), fuzzy logic for causal inference (Intervention), conventional modeling methods (Comparator), and performance or interpretability outcomes (Outcomes).
The review was conducted and reported in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 guidelines to ensure transparency and methodological rigor [63]. A systematic search was performed across 6 bibliographic databases, with references managed using Zotero (v7.0.15) and blinded title-abstract and full-text screening conducted in Rayyan (Qatar Computing Research Institute). The review protocol was prospectively registered in PROSPERO (registration number CRD420251044493). Risk of bias was assessed using the Joanna Briggs Institute (JBI) checklist for analytical cross-sectional studies [64] and the PROBAST+AI tool for predictive modeling studies [65]. The procedures applied at each stage of the review are described in detail below.
This systematic review addresses a critical gap in the literature regarding how fuzzy logic–based approaches have been used to support causal inference in health care. The primary research question guiding the review was: How have fuzzy modeling approaches been applied, alone or in combination with other methods, to address causal questions in complex, multivariable health care settings? Rather than testing superiority, the review aimed to examine the contexts, modeling strategies, and assumptions under which fuzzy logic–based methods have been used in relation to causal objectives, particularly in settings characterized by uncertainty, nonlinearity, and high-dimensional data.
Research Questions and Scope
To structure this analysis, three secondary questions were defined: (RQ1) What modeling characteristics and design features are commonly reported in fuzzy-based approaches addressing causal questions? (RQ2) Under what data or problem contexts are fuzzy logic–based methods compared with conventional modeling approaches? (RQ3) How are the resulting insights framed in relation to clinical or policy-relevant decisions?
Eligibility Criteria
Studies were eligible for inclusion if they applied fuzzy logic–based approaches in health care settings and demonstrated either an explicit or implicit causal objective. Explicit causal intent was defined using formal causal frameworks, counterfactual reasoning, or clearly articulated intervention contrasts. Implicit causal intent was identified when modeling structures, analytical interpretations, or conclusions were framed in terms of causal effects, intervention impact, or decision-relevant implications beyond prediction. This inclusive criterion allowed the review to capture both formally specified and informally articulated causal approaches.
Studies were excluded if they did not use fuzzy-based techniques, lacked any causal objective, or focused exclusively on diagnostic classification or prediction without causal interpretation.
Information Sources and Search Strategy
The literature search was conducted between March and June 2025 across 6 electronic databases: PubMed, Web of Science, ScienceDirect, SpringerLink, Scopus, and IEEE Xplore. Search strategies combined controlled vocabulary terms (eg, MeSH [Medical Subject Headings] in PubMed) with platform-specific free-text keywords to capture studies at the intersection of fuzzy logic, causal inference, and health care. To enhance sensitivity, the search was intentionally broad and complemented by manual screening of reference lists from included studies. The full search strategies for each database are reported in Table 1.
Study Selection Process
Eligible studies were required to be peer-reviewed, published in English between 2014 and 2025, and to provide sufficient methodological detail to allow critical appraisal. Only original research articles with accessible full text and direct relevance to clinical or health policy decision-making were included. While the reporting of performance metrics (eg, accuracy and area under the curve, AUC) and the use of comparator methods were encouraged, their absence did not constitute grounds for exclusion when studies provided substantive contributions to fuzzy modeling or causal reasoning in health care.
Studies were excluded if they were not written in English, did not constitute original research (including narrative or systematic reviews, editorials, commentaries, or conference abstracts without full text), were published outside the predefined time frame, involved extremely small samples (fewer than 5 observations), or lacked sufficient methodological transparency to support reproducibility or critical appraisal. These criteria were applied to ensure inclusion of studies with empirical relevance, conceptual rigor, and clarity in reporting.
Study selection was managed using Zotero and Rayyan (Qatar Computing Research Institute). Two reviewers (JJ and KC-T) independently screened titles, abstracts, and full texts according to predefined inclusion and exclusion criteria. Discrepancies were resolved through discussion with a third reviewer (PM). Interreviewer agreement was 94%, and final inclusion decisions were reached by consensus, with oversight provided by additional authors (XC-F and AV).
Data Extraction and Classification
After removal of duplicates (n=6) and clearly irrelevant records (n=390), 225 records were retained for title and abstract screening. Of these, 153 were excluded based on predefined inclusion criteria, leaving 72 full-text articles assessed for eligibility. No reports were lost during retrieval. Thirty-five full-text articles were excluded, most commonly due to publication outside the predefined time frame (n=27), as well as wrong population (n=2), wrong outcome (n=3), wrong publication type (n=1), or wrong study design (n=2). A total of 37 studies were included in the final synthesis.
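The selection counts can be checked with simple arithmetic. The sketch below uses only numbers reported in this review (the starting total of 621 identified records is the figure given in the Results section); the variable names are illustrative.

```python
# Counts as reported in the review; variable names are illustrative.
identified = 621            # records identified across the 6 databases
duplicates = 6
clearly_irrelevant = 390

screened = identified - duplicates - clearly_irrelevant
excluded_at_screening = 153
full_text_assessed = screened - excluded_at_screening

full_text_exclusions = {
    "publication outside time frame": 27,
    "wrong population": 2,
    "wrong outcome": 3,
    "wrong publication type": 1,
    "wrong study design": 2,
}
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_assessed - excluded_full_text

print("screened (title/abstract):", screened)
print("full texts assessed:      ", full_text_assessed)
print("full texts excluded:      ", excluded_full_text)
print("included in synthesis:    ", included)
```

Each stage reproduces the reported figure, so the flow from identification to inclusion is internally consistent.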
Following the inclusion of 37 studies, a structured data extraction process was implemented to ensure consistency while accommodating methodological heterogeneity. Two reviewers independently extracted data using a piloted extraction form, with discrepancies resolved through consensus or, when necessary, third-party adjudication. The extraction framework was designed to capture both technical modeling features and elements relevant to causal framing and interpretation.
Evidence Synthesis
Extracted variables were organized across four domains: (1) bibliographic and contextual information (author, year, journal, and health care domain); (2) data characteristics (data source, dataset type, and sample size); (3) modeling and analytical features, including fuzzy modeling framework (eg, FIS, FCM, neuro-fuzzy, and Takagi-Sugeno), comparator methods (eg, generalized linear models, structural equation models, and directed acyclic graph–informed analyses), and reported performance metrics (eg, accuracy, AUC, and root mean square error); and (4) elements related to causal framing, including stated causal assumptions, interpretability features, and reported clinical or policy implications.
To support consistency across studies and reduce terminological heterogeneity, ELICIT—an artificial intelligence–assisted evidence synthesis platform—was used to standardize extracted terminology, assist in the classification of modeling approaches, and check internal coherence of extracted items. ELICIT was used as a supportive tool for data organization and synthesis and did not replace reviewer judgment in data extraction or interpretation.
Risk of Bias Assessment
Risk of bias was assessed using two complementary tools, selected according to the methodological design of each included study. The JBI checklist [64] was applied to studies with observational or associational designs, particularly those using structural causal reasoning without formal identification strategies. Studies focused on predictive model development or validation were evaluated using the PROBAST+AI tool [65].
Tool-specific criteria guided the assessment of potential bias. For studies evaluated with PROBAST+AI, emphasis was placed on outcome definition, predictor handling, and analytical transparency. For studies assessed using the JBI checklist, particular attention was given to reporting adequacy, conceptual rigor, and overall methodological clarity. These assessments identified recurrent limitations related to internal validity and reporting quality across the included evidence.
Overall risk of bias was classified as low, moderate, or high based on the severity and frequency of methodological concerns identified using each assessment tool. Given the substantial methodological heterogeneity of the included studies, a formal GRADE (Grading of Recommendations Assessment, Development, and Evaluation) assessment was not performed. Instead, the certainty of the evidence was appraised qualitatively by triangulating risk of bias assessments, methodological coherence, and robustness of reporting.
Data Synthesis and Analytical Strategy
Following data extraction, findings were synthesized to characterize modeling approaches, identify comparative trends, and highlight evidence gaps at the intersection of fuzzy logic and causal inference. Descriptive statistics were used to summarize the distribution of included studies by health care domain, modeling approach, data source, and sample size, with frequencies and proportions calculated for fuzzy modeling techniques, data types (real-world or synthetic), and reported performance metrics.
In parallel, a thematic analysis was conducted across 4 focal areas: diversity of modeling frameworks, comparative performance, treatment of causal assumptions, and relevance to clinical or policy decision-making. This analytical phase also identified recurring methodological limitations, reporting inconsistencies, and underexplored applications, thereby complementing the structured risk of bias assessments. Summary tables were used to support structured comparison across studies and ensure consistent classification by health care domain, fuzzy modeling technique, comparator method, and reported performance metrics.
Results
During the systematic search conducted between March and June 2025, a total of 621 records were identified across 6 electronic databases. After removal of duplicates and screening of titles, abstracts, and full texts, 37 studies published between 2014 and 2025 met the inclusion criteria and were retained for final synthesis. The PRISMA 2020 flow diagram (Figure 1) details the study selection process, including the number of records screened, excluded, and included at each stage of the review.
Table 2 lists fuzzy logic–based methodologies from 37 studies, showing the literature’s methodological diversity. The table defines each approach, its main analytical role, common application in reviewed studies, and frequency of use.
Table 3 summarizes the 37 studies included in the final synthesis, spanning health care domains such as infectious diseases, cardiovascular conditions, cancer, mental health, occupational health, and preterm birth. Across studies, the most frequently applied fuzzy approaches were FIS (Mamdani type), ANFIS, fuzzy analytic hierarchy process (FAHP) variants, and FCM. Sample sizes varied widely, ranging from fewer than 100 participants to large-scale public datasets exceeding 1000 cases, with data sources including institutional or hospital records, expert-based judgments, and simulated data.
Fourteen studies used direct comparator methods, most commonly logistic regression, decision trees, or ensemble classifiers, whereas the remaining studies relied on baseline comparisons or did not include external benchmarks. Performance was typically reported using accuracy or AUC, with sensitivity and specificity included in selected cases. Importantly, only a minority of studies explicitly framed their analyses within formal causal inference paradigms, while most remained primarily predictive or associative in scope.
The temporal distribution of the included studies shows an uneven pattern over the past decade, with episodic increases rather than a steady growth trajectory, culminating in a pronounced peak in 2024 (Figure 2).
Figure 1. Flow diagram illustrating the identification, screening, eligibility assessment, and inclusion of studies in the systematic review, according to PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) 2020 guidelines [63].
Figure 2. Chronological distribution of included investigations (2015‐2025). The bar chart illustrates the annual number of studies published throughout the review period.
The most frequently addressed conditions were infectious diseases (n=10) [26-34,70], cardiovascular diseases (n=7) [35,36,38-42], cancer (n=7) [43-49], occupational health and safety (n=5) [55-59], mental health (n=2) [53,54], and preterm birth (n=2) [50,51]. Additional studies fell into miscellaneous categories [66,68,69].
Regarding data sources, most studies used institutional or hospital datasets (n=18) [31,35,36,41,44-48,50,51,55,56,66-71], while 9 relied on public datasets [28-30,32,33,38,40,49,57], 5 reported expert-based data [27,34,39,54,59], 3 reported mixed sources [26,53,58], and 2 used simulated or synthetic data [42,43].
Sample sizes varied considerably across studies: 13 used large datasets (n≥1000) [26,28-32,36,38,45,46,51,67,68], another 10 relied on medium-sized samples (n=100 to 999) [35,40,41,44,49,50,53,58,66,69], 10 used small datasets (n<100) [34,43,47,48,54-57,59,68,70], and in 4 studies, the sample size was not applicable [27,33,39,42].
A broad array of fuzzy logic techniques was identified across the included studies, reflecting substantial methodological heterogeneity. The most commonly used methods were FIS and their variations (n=8) [26,29,33,35,55,56,58,70], frequently implemented using Mamdani-type structures; followed by the FAHP (n=6) [26,31,32,39,50,67]; FCM (n=5) [28,43,44,54,66,69]; adaptive neuro-fuzzy systems (n=3) [41,45,48], which typically combine neural architectures with fuzzy rule bases for improved learning capacity; and hybrid fuzzy approaches combined with multicriteria decision models (n=3) [27,38,57].
Other models used were fuzzy clustering (C-means or K-means) (n=2) [51,67], fuzzy-trace theory (n=2) [46,49], fuzzy-set qualitative comparative analysis (fsQCA) (n=2) [30,53], Takagi-Sugeno models (n=1) [66], fuzzy failure mode and effects analysis (n=1) [59], mediative fuzzy logic (n=1) [42], fuzzy evidential reasoning (n=1) [34], likelihood-fuzzy analysis (n=1) [47], and profile-based fuzzy association rule mining (n=1) [40].
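The frequencies and proportions behind these counts can be reproduced in a few lines. The sketch below takes the per-method study counts stated above and expresses each as a percentage of the 37 included studies; the dictionary keys are shortened labels chosen for illustration.

```python
# Study counts per fuzzy method, as reported in the text above.
method_counts = {
    "FIS (Mamdani-type and variants)": 8,
    "FAHP": 6,
    "FCM": 5,
    "ANFIS": 3,
    "Hybrid fuzzy + MCDM": 3,
    "Fuzzy clustering": 2,
    "Fuzzy-trace theory": 2,
    "fsQCA": 2,
    "Takagi-Sugeno": 1,
    "Fuzzy FMEA": 1,
    "Mediative fuzzy logic": 1,
    "Fuzzy evidential reasoning": 1,
    "Likelihood-fuzzy analysis": 1,
    "PB-FARM": 1,
}

total = sum(method_counts.values())

# Percentage of included studies per method, rounded to 2 decimals.
percentages = {m: round(100 * n / total, 2) for m, n in method_counts.items()}

for method, n in method_counts.items():
    print(f"{method:34s} {n:2d} ({percentages[method]:.2f})")
print("total studies:", total)
```

The counts sum to the 37 included studies, which indicates that each study was assigned to a single primary method category.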
Among the studies reviewed, 14 conducted direct comparative evaluations against traditional methods such as logistic regression, decision trees, or standard statistical models [28,39-41,43-45,47,48,57,59,66,68,69] and 6 studies used baseline comparisons, typically involving simple pre/post assessments without an external benchmark [26,31,32,46,50,56]. In contrast, 17 studies applied fuzzy modeling in isolation, without any form of benchmarking or comparator method, relying solely on internal outputs to assess performance [27,29,30,33-36,38,42,49,51,53-55,58,67,70].
Among the studies that conducted direct comparative evaluations, 5 reported that fuzzy logic models outperformed traditional methods, including statistical classifiers and machine learning algorithms. These included Mahmoodi et al [44], who achieved 95.8% accuracy in gastric cancer prediction using FCM; Yılmaz et al [45], who obtained 94.6% accuracy with a neuro-fuzzy model for lung cancer; Subramanian et al [43], who reported 94.3% overall accuracy using a layered FCM for breast cancer risk; Sabahi [39], who introduced a bimodal FAHP model with accuracies above 85%; and Saleh et al [41], whose ANFIS classifier outperformed the other tested classifiers in diabetic retinopathy detection.
Three studies showed that fuzzy models yielded comparable or slightly superior performance relative to conventional methods. Argyropoulos et al [68] reported equivalent AUC values for both fuzzy logic and logistic regression models in predicting acute kidney injury, while Pota et al [47] found similar predictive accuracy between likelihood-fuzzy analysis and naïve Bayes classifiers in radiotherapy toxicity. Stanković and Stanković [48] also demonstrated that a neuro-fuzzy system marginally outperformed an artificial neural network in predicting prostate cancer survival.
The remaining 6 studies—Amirkhani et al [66], Yavari et al [40], Mohandes et al [57], Benito et al [28], Sümbül-Şekerci et al [69], and Demir and Sabır [59]—involved direct comparisons but did not report sufficient methodological or statistical detail to clearly assess the relative effectiveness of the fuzzy approach. To visually summarize the comparative performance of fuzzy logic models versus conventional statistical approaches, Figure 3 presents reported accuracy values from studies that provided quantifiable metrics. Only those studies with explicit accuracy comparisons were included, enabling a focused assessment of relative predictive performance across diverse health care domains.
Across these studies, common performance metrics included accuracy (84%‐95.8%), AUC (0.70‐0.95), and error measures such as root mean square error, mean absolute error, and mean squared error. These results underscore the adaptability of fuzzy modeling to clinical decision-making contexts marked by uncertainty, incomplete data, and the need for interpretability.
In terms of causal inference, conceptual approaches varied across the studies. While most of the studies addressed high-complexity settings involving multiple interacting variables, only two explicitly adopted formal causal inference frameworks. These included Lee et al [30], who used fsQCA with sufficiency and necessity thresholds; and Mohandes et al [57], who implemented a hybrid interval-valued intuitionistic fuzzy DEMATEL-ANP (decision-making trial and evaluation laboratory analytic network process) model with cross-validation.
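To make the idea of sufficiency and necessity thresholds concrete, the sketch below computes the standard fsQCA consistency measures for a single condition and outcome. The membership scores are invented for illustration, and the cutoffs of 0.8 for sufficiency and 0.9 for necessity are commonly cited conventions assumed here, not the thresholds reported by Lee et al [30].

```python
def sufficiency_consistency(condition, outcome):
    """Consistency of 'condition is sufficient for outcome': sum(min(x, y)) / sum(x)."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(condition)


def necessity_consistency(condition, outcome):
    """Consistency of 'condition is necessary for outcome': sum(min(x, y)) / sum(y)."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(outcome)


# Hypothetical fuzzy membership scores for five cases (values in [0, 1]).
condition = [0.9, 0.8, 0.2, 0.6, 0.1]
outcome = [1.0, 0.9, 0.3, 0.7, 0.4]

# Assumed conventional cutoffs; an actual study would justify its own.
SUFFICIENCY_THRESHOLD = 0.8
NECESSITY_THRESHOLD = 0.9

suff = sufficiency_consistency(condition, outcome)
nec = necessity_consistency(condition, outcome)
print("sufficient:", suff >= SUFFICIENCY_THRESHOLD, round(suff, 3))
print("necessary:", nec >= NECESSITY_THRESHOLD, round(nec, 3))
```

In this toy data every case has condition membership no greater than its outcome membership, so the condition passes the sufficiency test but not the necessity test. This illustrates why the two relations are assessed separately.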
Six additional studies [28,38,44,53,54,66] simulated causal mechanisms using methods such as iterative expert-based system mapping or FCM. However, none of these studies explicitly operationalized a formal causal inference framework grounded in counterfactual reasoning or directed acyclic graphs. Instead, causal assumptions were inferred through expert consensus or embedded in the structure of fuzzy systems.
The remaining 29 studies used fuzzy logic primarily for predictive or associative analysis [26,27,29,31-36,39-43,45-51,55,56,58,59,67-70], with causal relationships often left implicit, untested, or loosely derived from domain-specific knowledge alone.
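As an illustration of what simulating a causal mechanism with an FCM typically involves, the sketch below iterates a small map until it stabilizes and compares two scenarios. The three concepts, the expert-style weights, and the choice of a sigmoid update with self-memory are assumptions made for this example; FCM studies use several different update rules, and none of the included models is reproduced here.

```python
import math


def sigmoid(x, lam=1.0):
    """Squashing function that keeps concept activations in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def fcm_run(weights, state, clamped=(), max_iter=200, tol=1e-6):
    """Iterate a fuzzy cognitive map until activations stabilize.

    weights[j][i] is the signed influence of concept j on concept i.
    Concepts whose indices are in `clamped` are held at their initial value,
    which is how a what-if scenario is imposed.
    """
    state = list(state)
    n = len(state)
    for _ in range(max_iter):
        new_state = []
        for i in range(n):
            if i in clamped:
                new_state.append(state[i])
                continue
            total = state[i] + sum(weights[j][i] * state[j] for j in range(n) if j != i)
            new_state.append(sigmoid(total))
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state
        state = new_state
    return state


# Hypothetical concepts: 0 = exposure, 1 = inflammation, 2 = disease risk.
W = [
    [0.0, 0.7, 0.4],  # exposure -> inflammation, exposure -> disease risk
    [0.0, 0.0, 0.8],  # inflammation -> disease risk
    [0.0, 0.0, 0.0],
]

exposed = fcm_run(W, [1.0, 0.5, 0.5], clamped={0})
unexposed = fcm_run(W, [0.0, 0.5, 0.5], clamped={0})
print("risk activation, exposed vs unexposed:", exposed[2], unexposed[2])
```

The contrast between the two runs resembles an intervention, but it only propagates whatever the weight matrix asserts. The weights themselves are never tested against data, which is the limitation noted above.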
Table 3 provides a detailed synthesis of the 14 studies that directly compare fuzzy logic models with traditional statistical or machine learning methods. Most of these studies reported performance gains for fuzzy approaches, particularly in cancer [43-45] and cardiovascular domains [39,41]. In several cases, fuzzy models offered not only higher accuracy or sensitivity but also enhanced interpretability. Others showed broadly comparable results with added value in robustness [47,48,68]. A smaller group reported either mixed outcomes or limited statistical detail, emphasizing interpretability and methodological novelty over raw predictive gains [28,40,57,59,66,69].
Figure 3. Reported accuracy values from studies performing direct quantitative comparisons between fuzzy logic–based models and conventional approaches [39,41,43-45,47,68].
Collectively, the evidence summarized in Table 4 indicates that fuzzy logic–based approaches have been evaluated against conventional methods in a limited subset of studies, yielding heterogeneous results and variable reporting quality. Although several comparative assessments suggest potential advantages in managing uncertainty and enhancing interpretability, the absence of systematic benchmarking and the predominance of predictive objectives preclude definitive conclusions regarding comparative effectiveness.
While the findings indicate that fuzzy logic–based approaches are frequently applied in predictive health care modeling, the strength of the available evidence must be interpreted in light of methodological quality and risk of bias. Seventeen studies were evaluated using the PROBAST+AI tool, specifically designed for assessing bias in prediction model studies [65]. Of these, 9 were rated as having a high risk of bias [27,34,35,39,43,44,47,66,67], and 8 studies were rated as moderate risk [28,33,40,41,45,48,68,69], standing out for more robust validation procedures, detailed variable handling, and partial transparency. None achieved a low-risk rating.
Of the 20 studies assessed using the JBI checklist [64] for analytical cross-sectional designs, 5 were rated as having low risk of bias [29,30,46,49,53], while the remaining 15 [26,31,32,36,38,42,50,51,54-59,68,70] were classified as moderate risk (Figure 4).
To examine the distribution of fuzzy logic techniques across health care applications, a cross-tabulated synthesis was conducted. As shown in Figure 5, the most frequently applied approaches were FIS, FCM, and ANFIS, followed by FAHP, fsQCA, fuzzy evidential reasoning, and Takagi-Sugeno models. The use of these techniques varied across application domains, with oncology, infectious diseases, cardiovascular health, and mental health exhibiting the highest methodological diversity.
The nature of causal engagement across the included studies spanned a continuum from explicitly formalized causal frameworks to approaches in which causal reasoning remained implicit or embedded within expert-driven or structurally defined fuzzy models. Only two studies (2/37, 5.4%) explicitly addressed causal questions using formal causal inference methodologies. A small subset relied on inferred causal structures derived from expert knowledge or fuzzy cognitive maps (6/37, 16.2%). In contrast, most studies primarily implemented predictive or associative modeling approaches, where causal interpretation was not formally specified and was instead inferred indirectly from model structure, expert judgment, or post hoc interpretation (29/37, 78.4%). This distribution highlights substantial heterogeneity in how causal principles are operationalized across fuzzy logic–based applications in health care.
Figure 4. Risk of bias across included studies assessed using PROBAST+AI and the JBI checklist. Among studies evaluated with PROBAST+AI, 53% (9/17) were classified as high risk and 47% (8/17) as moderate risk, with none rated as low risk. In contrast, among studies assessed using the JBI checklist, 25% (5/20) were rated as low risk and 75% (15/20) as moderate risk. JBI: Joanna Briggs Institute; PROBAST: Prediction Model Risk of Bias Assessment Tool.
Figure 5. Distribution of fuzzy logic techniques across health care domains. ANFIS: adaptive neuro-fuzzy inference system; FAHP: fuzzy analytic hierarchy process; FCM: fuzzy cognitive map; FER: fuzzy evidential reasoning; FIS: fuzzy inference system; fsQCA: fuzzy-set qualitative comparative analysis; FTT: fuzzy-trace theory; MCDM: multicriteria decision-making; TS: Takagi-Sugeno model.
Discussion
This systematic review synthesized evidence from 37 studies published between 2014 and 2025 that used fuzzy logic–based methodologies in health care settings with explicit or implicit causal objectives. The included studies span a wide range of clinical and public health domains, including infectious diseases, cancer, cardiovascular diseases, occupational health and safety, mental health, and preterm birth, underscoring the broad applicability of fuzzy modeling to diverse health-related problems. Across domains, the most frequently reported approaches were FIS, ANFIS, FAHP, and FCM. Rather than indicating methodological convergence, this distribution reflects context-dependent adaptations of fuzzy logic to address uncertainty, nonlinearity, and expert-guided reasoning in complex health care environments.
Only a limited subset of studies conducted direct comparative evaluations between fuzzy logic–based models and conventional statistical or machine learning approaches. Among the 14 studies that included explicit comparators, 5 reported superior performance of fuzzy models [39,41,43-45], most frequently in cancer and cardiovascular applications, while 3 demonstrated broadly comparable results [47,48,68]. The remaining 6 studies provided comparative analyses with insufficient methodological or statistical detail to support firm conclusions regarding relative effectiveness [28,40,57,59,66,67,69]. Importantly, most included studies relied on internal validation procedures, baseline comparisons, or expert-defined structures without external benchmarks, often using small- to medium-sized datasets. This pattern limits the generalizability of reported performance gains and indicates that, while fuzzy approaches may perform competitively in specific contexts characterized by nonlinearity or uncertainty, evidence supporting consistent superiority over conventional methods remains limited and heterogeneous.
Causal inference was explicitly operationalized in only a small proportion of the included studies. Specifically, two investigations adopted formal causal inference frameworks: Lee et al [30] used fsQCA, explicitly modeling configurations of necessary and sufficient conditions at the population level. Mohandes et al [57] implemented a hybrid interval-valued intuitionistic fuzzy DEMATEL-ANP approach to structurally identify and prioritize causal drivers in occupational safety systems. In both cases, causal claims were grounded in transparent methodological procedures, explicit thresholds, and internally coherent validation strategies, rather than inferred post hoc from predictive performance.
Beyond these two studies, causal reasoning was indirect. Six additional investigations relied on expert-based mappings, FCM, or influence structures to simulate causal mechanisms without formally testing necessity, sufficiency, or counterfactual dependence [28,38,44,53,54,66,69]. In most studies, fuzzy logic was applied primarily for predictive or associative purposes, with causal assumptions embedded implicitly within model architecture or domain expertise rather than explicitly articulated or empirically evaluated.
Taken together, these findings reveal a marked disconnect between the theoretical capacity of fuzzy logic to represent causal structure and its prevailing empirical use in health care research. This gap appears to reflect not inherent conceptual limitations of fuzzy methods, but rather broader issues related to study design, validation practices, and reporting rigor, which constrain the translation of fuzzy modeling from predictive decision support to explicit causal inference.
Risk of bias constituted a major limiting factor across the included studies. Among those evaluated with PROBAST+AI [65], none achieved a low-risk rating, with most classified as moderate or high risk, while only a small proportion of studies assessed using the JBI checklist [64] were rated as low risk.
In contrast to much of the existing literature, which has primarily emphasized predictive accuracy or isolated clinical applications, the present review integrates formal risk-of-bias assessment with thematic synthesis to jointly evaluate reported performance, methodological rigor, and the explicitness of causal assumptions. This perspective highlights both the strengths and current limitations of fuzzy logic–based approaches: while they provide interpretable, rule-based models well suited to ambiguity and nonlinearity, their application within explicitly causal analytical frameworks remains limited and inconsistent.
These conclusions must be interpreted considering several important limitations, including substantial heterogeneity across health care domains, modeling strategies, and outcome measures, which precluded quantitative meta-analysis; inconsistent reporting practices, such as limited use of comparator models and incomplete outcome reporting; and the frequent reliance on small- to medium-sized datasets without external validation. Collectively, these factors reduce the overall certainty and generalizability of the current evidence base.
Despite these limitations, the findings carry important implications for both research and practice. Fuzzy systems appear particularly well suited to health care and policy contexts characterized by incomplete data, multidimensional interactions, and a strong demand for interpretability. Their capacity to encode expert knowledge and tolerate imprecision supports their use in applications such as risk stratification, early diagnosis, and context-sensitive prioritization. Realizing this potential, however, will require methodological consolidation, including greater standardization in reporting, more consistent use of comparator frameworks, and external validation across real-world datasets. Importantly, integration with formal causal frameworks—such as directed acyclic graphs or structural causal models—offers a pathway to strengthen causal interpretability while preserving the distinctive advantages of fuzzy reasoning.
In parallel, recent advances in artificial intelligence have largely emphasized the automation of data extraction, measurement, and pattern recognition in clinical settings, particularly through machine learning and computer vision–based applications [71-76]. While these approaches have improved efficiency and scalability, they remain predominantly oriented toward prediction rather than causal inference. Addressing this gap requires analytical frameworks that move beyond automation to explicitly represent causal structure, intervention contrasts, and temporal assumptions.
In this context, future research would benefit from explicitly incorporating TTE [17-20] when applying fuzzy logic to observational health care data. TTE provides a principled framework for specifying causal estimands, temporal ordering, and hypothetical interventions, thereby addressing key sources of bias that remain unresolved in many fuzzy-based applications. By defining eligibility criteria, treatment strategies, follow-up periods, and causal contrasts a priori, TTE can situate fuzzy, rule-based models within transparent causal designs—an approach that is particularly relevant in real-world health care settings where randomized trials are often infeasible.
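One way to picture what specifying these components a priori could look like in practice is a protocol object that is fixed before any fuzzy model is fitted. The sketch below is purely illustrative: the class name, field names, and example values are hypothetical and do not come from the TTE literature or from any included study.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen: the protocol cannot be edited once declared
class TargetTrialProtocol:
    """Hypothetical container for the design elements of an emulated target trial."""
    eligibility_criteria: tuple
    treatment_strategies: tuple
    time_zero: str
    follow_up_days: int
    outcome: str
    causal_contrast: str

    def missing_components(self):
        """Names of design elements that were left empty."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative values only; a real protocol would be clinically justified.
protocol = TargetTrialProtocol(
    eligibility_criteria=("age >= 18", "no prior diagnosis of the outcome"),
    treatment_strategies=("initiate therapy at baseline", "do not initiate therapy"),
    time_zero="date eligibility is first met",
    follow_up_days=365,
    outcome="hospital admission within follow-up",
    causal_contrast="risk difference at end of follow-up",
)

print(protocol.missing_components())
```

A fuzzy model would then be fitted only to records that satisfy the declared eligibility criteria and follow-up window, so the rule base answers a question that was fixed in advance rather than one chosen after inspecting the outputs.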
Viewed in this way, fuzzy logic should not be considered merely an auxiliary modeling technique, but a potential component of hybrid causal approaches in health care. When interpretability and causal structure are integrated into model design rather than treated as secondary considerations, fuzzy systems may help bridge the gap between statistical prediction and meaningful causal explanation. Advancing this agenda will require further methodological refinement, interdisciplinary collaboration, and a move toward more coherent and explicitly causal research programs in complex health systems.
Supplementary material
Checklist 1. PRISMA 2020 checklist. doi: 10.2196/83425
References
- 1. Seoni S, Jahmunah V, Salvi M, Barua PD, Molinari F, Acharya UR. Application of uncertainty quantification to artificial intelligence in healthcare: a review of last decade (2013-2023). Comput Biol Med. Oct 2023;165:107441. doi: 10.1016/j.compbiomed.2023.107441. Medline: 37683529
- 2. Alizadehsani R, Roshanzamir M, Hussain S, et al. Handling of uncertainty in medical data using machine learning and probability theory techniques: a review of 30 years (1991-2020). Ann Oper Res. Mar 21, 2021;339(3):1-42. doi: 10.1007/s10479-021-04006-2. Medline: 33776178. PMC: 7982279
- 3. Liu F. Data science methods for real-world evidence generation in real-world data. Annu Rev Biomed Data Sci. Aug 2024;7(1):201-224. doi: 10.1146/annurev-biodatasci-102423-113220. Medline: 38748863
- 4. Pearl J. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press; 2009. ISBN: 9780521895606
- 5. Pearl J. An introduction to causal inference. Int J Biostat. Feb 26, 2010;6(2). doi: 10.2202/1557-4679.1203. Medline: 20305706. PMC: 2836213
- 6. Pearl J. On the consistency rule in causal inference: axiom, definition, assumption, or theorem? Epidemiology. Nov 2010;21(6):872-875. doi: 10.1097/EDE.0b013e3181f5d3fd. Medline: 20864888
- 7. Pearl J. Graphical models, potential outcomes and causal inference: comment on Lindquist and Sobel. Neuroimage. Oct 1, 2011;58(3):770-771. doi: 10.1016/j.neuroimage.2011.06.007. Medline: 21699988
- 8. Rubin DB. Formal mode of statistical inference for causal effects. J Stat Plan Inference. Jul 1990;25(3):279-292. doi: 10.1016/0378-3758(90)90077-8
