Artificial intelligence in the diagnosis and management of dysphagia: a scoping review
Rayane Délcia da Silva, Suzanne Bettega Almeida, Flávio Magno Gonçalves, Bianca Simone Zeigelboim, José Stechman-Neto, Angela Graciela Deliga Schroder, Weslania Viviane Nascimento, Rosane Sampaio Santos, Cristiano Miranda de Araujo

TL;DR
This review explores how artificial intelligence is being used to diagnose and manage dysphagia, focusing on the potential of deep learning techniques.
Contribution
The study maps technological advancements in AI for dysphagia diagnosis and highlights the predominant use of deep learning methods.
Findings
Deep Learning is predominantly used in AI applications for dysphagia diagnosis.
Videofluoroscopy is the most common reference examination in these studies.
Neurological conditions are the most prevalent among patients in the reviewed studies.
Abstract
This scoping review aimed to map and synthesize evidence on technological advancements using Artificial Intelligence in the diagnosis and management of dysphagia. We followed the PRISMA guidelines and those of the Joanna Briggs Institute, focusing on research about technological innovations in dysphagia. The protocol was registered on the Open Science Framework platform. The databases consulted included EMBASE, Latin American and Caribbean Health Sciences Literature (LILACS), Livivo, PubMed/Medline, Scopus, Cochrane Library, Web of Science, and grey literature. The acronym 'PCC' was used to consider the eligibility of studies for this review. After removing duplicates, 56 articles were initially selected. A subsequent update resulted in 205 articles, of which 61 were included after applying the selection criteria. Videofluoroscopy of swallowing was used as the reference examination…
INTRODUCTION
Dysphagia, a symptom that impairs swallowing and can lead to pulmonary complications, dehydration, and malnutrition, is a growing concern in studies due to its impact on patients' quality of life and the healthcare system. It affects about 12-13% of hospitalized patients, rising to 30% in the elderly, contributing to a 47.5% increase in hospitalizations in this group, and is considered a geriatric syndrome. The prevalence can be as high as 60% in intensive care or home nursing settings, with rates varying based on associated comorbidities^(1,2)^.
Distinguishing the etiology and performing early and accurate diagnosis play a fundamental role in the prognosis of dysphagia, which is why they have been the subject of extensive research. Evaluation modalities are generally divided between clinical approaches and imaging examinations, which complement each other. However, these assessments are considered subjective, and some examinations may face accessibility issues or lack standardized protocols. Additionally, special attention must be given to the risk-benefit aspects for the patient, making it essential for this assessment to be evidence-based^(3,4)^.
Artificial Intelligence (AI) consists of a set of technologies designed to perform tasks in a manner similar to human intelligence. Intelligent agents are trained using data until they can carry out their functions autonomously. Subfields of AI include Machine Learning (ML) algorithms, which identify patterns and make predictions, and Deep Learning (DL), which is considered more complex due to its use of layered neural networks. These technologies contribute to the emergence of new hypotheses, discoveries, and task optimization in healthcare, aiming for a safer and more efficient approach^(5-8)^. With technological advancements in healthcare, artificial intelligence plays a significant role, particularly in image analysis. In the context of dysphagia, AI offers new perspectives for identifying swallowing alterations and facilitating the rehabilitation process. Therefore, this review aims to map and synthesize evidence regarding technological advancements with AI in the diagnosis and management of dysphagia.
METHODS
This comprehensive review was conducted in accordance with the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) and the recommendations for scoping reviews by the Joanna Briggs Institute^(9)^. It was registered on the Open Science Framework (OSF) platform^(10)^.
Eligibility criteria
The acronym 'PCC' was used to formulate the following research question: “What is the evidence regarding technological advancements involving artificial intelligence in the diagnosis and management of dysphagia?” This acronym was also applied to determine the eligibility criteria for studies included in this review, representing:
P = Population (humans of any age group);
C = Concept (use of Artificial Intelligence);
C = Context (aid in the treatment and diagnosis of dysphagia).
Inclusion criteria
To map studies with a higher level of evidence, only primary and analytical studies that used AI in the evaluation or treatment of dysphagia were included, such as clinical trials, cohort studies, case-control studies, and cross-sectional, prospective, or retrospective studies. There were no restrictions regarding the gender or ethnicity of individuals, language of studies, publication date, or diagnosis.
Exclusion criteria
The following exclusion criteria were applied: a) animal studies; b) studies without any use of technology and/or innovation involving AI; c) studies without dysphagia management; d) reviews, case reports, case series, personal opinions, letters, posters, and conference abstracts.
Information sources and search
Word combinations were adapted for each of the seven selected electronic databases as sources for the search, namely: EMBASE, Latin American and Caribbean Health Sciences Literature (LILACS), LIVIVO, PubMed/Medline, Scopus, Cochrane Library, and Web of Science. Additionally, grey literature was also used as a source of information through AshaWire, Google Scholar (100 most relevant results), and ProQuest Dissertations & Theses Global (Appendix A).
Searches in electronic databases and grey literature were conducted on October 27, 2022, and an update was performed on November 3, 2023. All references were managed, and all duplicate studies were removed using appropriate software (EndNote® X7 Thomson Reuters, Philadelphia, PA). The reference lists of all included articles were checked using the web application Citation Chaser^(11)^, searching for both the citations used by these studies and the articles that cited them.
Selection of sources of evidence
Article selection was carried out in two phases. In the first phase, two reviewers (R.D.S and S.B) independently reviewed the titles and abstracts of all references. All articles that did not meet the pre-established criteria were excluded at this stage. In the second phase, the same reviewers independently read the full text of the articles selected in the first phase. When there was no consensus even after discussion, a third reviewer (R.S) was involved for the final decision.
To facilitate independent reading, the Rayyan website^(12)^ was used. In addition to the two reviewers who conducted blind assessments, a third team member (C.A) acted as a moderator.
Data charting process and data items
The collected data consisted of study characteristics (author, year of publication), population characteristics (age and pathology), algorithms and AI techniques used, model evaluation metrics, and outcomes.
If the necessary data were incomplete, efforts were made to contact the authors to obtain unpublished data. Authors could be contacted via email for three consecutive weeks in search of more information.
All relevant information was extracted and mapped, with extraction performed by the two main reviewers, followed by final data verification using the Bing AI tool^(13)^. As this is a descriptive review, any measures of effect were considered and used in the qualitative synthesis.
Reporting bias
To reduce the likelihood of reporting bias, a comprehensive search strategy was conducted through seven electronic databases, including a non-English language database (LILACS). Additionally, a search of grey literature was also conducted to check for the existence of studies meeting eligibility criteria but not yet published.
RESULTS
Selection of sources of evidence
The flow of studies through the scoping review process is presented in Figure 1. A total of 1,225 articles were retrieved from seven electronic databases. After removing duplicates, 1,012 references remained. Subsequently, 948 studies were excluded based on eligibility criteria. Four articles could not be located even after contacting the authors. A search of grey literature, reference lists, and an update of the databases on November 3, 2023, were also conducted, resulting in 69 studies for full-text reading. After the full-text review (second phase), 8 articles were excluded (see Appendix B). Based on the established inclusion criteria, 61 studies were identified as suitable for qualitative synthesis and results mapping.
Figure 1. Literature search flowchart and selection criteria
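The counts reported in the paragraph above can be checked with simple arithmetic. The following is a minimal sketch rather than part of the review's methodology: the variable names are illustrative, and only quantities that follow directly from the reported numbers are derived. The number of records added by the grey literature search, reference lists, and database update is not stated in the text and is deliberately not inferred here.

```python
# Counts exactly as reported in the text (variable names are illustrative).
retrieved = 1225                 # records retrieved from the seven databases
after_deduplication = 1012       # records remaining after duplicate removal
excluded_on_eligibility = 948    # records excluded based on eligibility criteria
full_text_read = 69              # studies read in full (after all sources and the update)
excluded_at_full_text = 8        # studies excluded in the second phase

# Quantities that follow directly from the reported counts.
duplicates_removed = retrieved - after_deduplication
remaining_after_exclusion = after_deduplication - excluded_on_eligibility
included = full_text_read - excluded_at_full_text

print("duplicates removed:", duplicates_removed)
print("remaining after eligibility exclusion:", remaining_after_exclusion)
print("included in qualitative synthesis:", included)
```

The last value agrees with the 61 studies reported as included.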
Characteristics of sources of evidence
The included studies were published from 1999^(15)^ to 2023^(16-19)^. Sample sizes ranged from one^(18)^ to 3408^(16)^ participants, with ages ranging from ten months^(20)^ to 94 years^(21,22)^. Most studies used some form of clinical evaluation with an imaging or sound examination, either as a comparator in the analyses or as the examination to be enhanced for diagnosis. Videofluoroscopic swallowing study (VFSS) was used in several studies^(16,23-28)^, with four studies concurrently using high-resolution manometry^(27,29-31)^. Only two studies^(32,33)^ used fiberoptic endoscopic evaluation of swallowing (FEES), and two studies reported the use of electromyography^(3,17)^. Sound resources were also used as an auxiliary evaluation method^(17,21,25,26,28,34-46)^. Only one study focused on therapeutic biofeedback and provided no information on an associated examination methodology^(47)^.
Regarding the underlying diseases present in the patients participating in the studies, there is a predominance of various neurological diseases, with stroke being the most cited in 12 studies^(16,18,21,22,27,32,36,40,46,48-51)^, neurodegenerative diseases like Parkinson's were present in 3 studies^(22,24,27)^, and two studies mentioned esophageal alterations^(42,52)^. Many studies did not report the population's pathology or had no applicability due to the research methodology. The algorithms used varied within the classification of Machine Learning^(2,3,16,20,25-28,30,32,35,37,39,40,42,44,45,48-50,53-62)^, Deep Learning^(17-19,21-24,29,31,33,34,36,38,41,43,46,51,52,57,63-75)^, and Computer Vision^(15,47)^ (Figure 2). Several studies have reported high accuracy in using AI and machine learning techniques for dysphagia assessment. For instance, deep learning models like U-Net and CNNs have achieved performance metrics such as F1 scores exceeding 0.9 and accuracy rates of 97.8%, indicating their robustness in detecting swallowing events and anatomical structures. Other methods, including support vector machines (SVM) and Mask-RCNN, have demonstrated high accuracy in classifying swallowing events, with metrics like sensitivity and specificity reaching over 90%. These findings emphasize the potential of AI-driven tools in improving diagnostic accuracy for dysphagia^(51,68,69,75)^.
Figure 2. Number of studies according to field, underlying condition, and data source
Although the results were considered effective, all studies highlighted the need for further research in the area. Descriptive characteristics of all included studies are recorded in Appendix C.
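The evaluation metrics cited in this section (accuracy, sensitivity, specificity, and F1 score) are all derived from the same confusion matrix. The sketch below shows the standard definitions for a binary classifier, for example one labelling swallows as unsafe (positive) or safe (negative). The function name and the example counts are hypothetical and are not taken from any of the included studies.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives. Ratios with a zero denominator
    are returned as 0.0 (a convention chosen for this sketch).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)          # recall, true positive rate
    specificity = ratio(tn, tn + fp)          # true negative rate
    precision = ratio(tp, tp + fp)            # positive predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because these metrics weight errors differently, a model can report high accuracy while still having low sensitivity when positive cases are rare, which is one reason studies often report several of them together.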
Results of individual sources of evidence
Studies on AI in dysphagia primarily rely on imaging resources such as VFSS for comparative analysis due to its high reliability^(16,23-28)^. However, the images generated by the examination are still analyzed by human judgment^(38,61,63,65)^. Since the swallowing process is considered complex, each structure contributes uniquely, with the hyoid bone being one of the most studied^(21,22,50)^. VFSS, along with high-resolution manometry, has also been considered in the evaluation of pharyngeal and esophageal anatomical structures^(27,29-31)^, and in the use of electromyography, AI aims to improve signal capture and analysis quality^(3,17)^.
Evaluation using sound resources is also part of the research, considered a safe, practical, and non-invasive support, and besides assisting in evaluation, it can be used as a biofeedback therapeutic resource. Cervical auscultation, commonly used in clinical evaluation, now consists of a range of digital resources such as accelerometers, microphones, and sensors that facilitate the analysis of specific parameters. Increasingly used in research practices, they enable diagnostic clinical markers and specific analyses^(20,26,28,32,35-37,39,41-44,53,56,57,66)^.
In research, the most addressed pathologies in adults were predominantly related to the neurological area, with stroke being highlighted in several studies^(16,21,22,24,27,36,40,46,48-51)^. In the pediatric population, cerebral palsy was the most cited condition in studies focusing on this age group^(20,37,49)^. The algorithms used in the studies varied according to the needs of each research, but most of them were classified between Machine Learning and Deep Learning, with significant accuracy levels.
DISCUSSION
The integration of AI in healthcare can enhance professionals' efficiency by optimizing data management and influencing decisions^(8)^. When combined with imaging resources for real-time swallowing evaluation, it becomes possible to offer more accurate diagnoses and improve therapeutic planning for patients with dysphagia. Key studies demonstrate high performance of deep learning models, such as CNNs and Mask-RCNN, in detecting and segmenting bolus movements in VFSS with precision metrics exceeding 90% in certain frameworks (Appendix C). This highlights the potential of AI not just in diagnostics but also in automating labor-intensive aspects of analysis^(65,69)^. It was observed that most studies focus on adults and use VFSS as a reference for reliability. Neurological diseases are frequently mentioned as the primary underlying conditions, and a variety of algorithms classified as ML or DL demonstrate good performance in achieving their goals. Stroke-related dysphagia, for example, has been widely studied with algorithms like SVMs and deep neural networks demonstrating robust accuracy in predicting aspiration events and laryngeal vestibule closure^(29,32)^. This focus underscores the significant burden that neurological conditions place on clinical resources and the need for innovations to improve workflow efficiency.
A videofluoroscopic swallowing study (VFSS), considered the reference examination in swallowing assessment, is frequently cited in research. However, its use presents challenges due to radiation exposure and limited availability in some locations. Additionally, the lack of a standardized protocol and variability in training, when provided, as well as in interpretations, directly impacts diagnostic accuracy. Recent methodologies integrating VFSS with AI-powered models have shown promise in addressing these limitations, such as high-resolution segmentation of swallowing structures via Mask-RCNN achieving intersection-over-union scores of 0.71^(65)^. VFSS is used by many professionals involved in dysphagia assessment and rehabilitation as the primary tool. FEES, another reference examination, is mentioned less frequently but faces similar challenges regarding availability and patient discomfort. Although both VFSS and FEES have high sensitivity and specificity, the need for human interpretation in defining results raises questions and inspires possibilities for creating algorithms that can automate evaluation and contribute to the analysis of specific structures^(4,5,32,57,66)^. Thus, AI contributes by aiming to automate and standardize some identification and recognition processes in an objective and effective manner. The same approach applies to the assessment of the esophageal region, which is also being studied. High-resolution manometry, considered highly accurate for this anatomical area, allows for the diagnosis of esophageal motor disorders. Additionally, studies utilizing deep learning and neural network classifiers for esophageal motility have reported sensitivity metrics above 85%, offering promising diagnostic complements^(29,30)^ (Appendix C).
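The intersection-over-union (IoU) score mentioned above measures how well a predicted segmentation mask overlaps a reference mask: the number of pixels labelled positive in both masks divided by the number labelled positive in at least one. The following is a minimal sketch of that definition on small binary masks. The function name and the toy masks are hypothetical, and this is not the evaluation code of the cited study.

```python
def intersection_over_union(predicted, reference):
    """Compute IoU between two binary masks given as equally sized nested lists.

    Returns 1.0 when both masks are empty; that convention is an
    assumption of this sketch, and other implementations may differ.
    """
    intersection = 0
    union = 0
    for predicted_row, reference_row in zip(predicted, reference):
        for p, r in zip(predicted_row, reference_row):
            if p and r:
                intersection += 1
            if p or r:
                union += 1
    return intersection / union if union else 1.0


# Toy 3x3 masks, for illustration only (1 = pixel belongs to the structure).
predicted_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
reference_mask = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
]

print(intersection_over_union(predicted_mask, reference_mask))  # 3 shared / 5 total = 0.6
```

An IoU of 1.0 means the masks coincide exactly and 0.0 means they do not overlap at all, so the reported 0.71 indicates substantial but incomplete overlap.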
In addition to these technologies, the biomechanics of swallowing is extremely complex, offering various forms of interpretation and analysis. The swallowing process involves not only images but also vibrations and sounds generated by the anatomical structures. Digital tools, such as accelerometers and high-resolution cervical auscultation sensors, have also shown significant diagnostic potential, with accuracy levels reaching 98% in distinguishing safe from unsafe swallows^(57)^ (Appendix C). However, these methods often rely on imaging examinations to validate accuracy, as cervical auscultation can be affected by technical interferences and the experience of the evaluator. Despite these advances, challenges remain regarding the generalizability of these tools across different patient populations and clinical environments^(76)^.
The algorithms used in the included studies, which achieved satisfactory though varied results in their evaluation metrics, belong to two interrelated fields of AI that play a significant role in data-driven decision-making. Machine Learning involves identifying patterns in data, making predictions, classifying information, and making decisions based on available information. It focuses on developing algorithms and models that enable systems to “learn”. Deep Learning, on the other hand, is a subcategory of Machine Learning, distinguished by its use of deeper neural networks. This distinction is particularly relevant in tasks involving large volumes of unstructured data, such as images, audio, and text, with audio and images being the most common data types in the studies^(77,78)^.
The integration of artificial intelligence in the evaluation and treatment of dysphagia holds great potential to enhance diagnostic accuracy and professional efficiency. Traditional methods, such as VFSS and FEES, face challenges related to availability and human interpretation. Machine learning and deep learning algorithms offer solutions to standardize and automate assessments, making them more objective. Research must progress to overcome the limitations of traditional methods, improving dysphagia management and patients' quality of life.
CONCLUSION
In conclusion, this study aimed to map and synthesize evidence on the integration of artificial intelligence in the diagnosis and management of dysphagia. The findings demonstrate that AI, particularly through machine learning and deep learning algorithms, offers transformative potential by improving diagnostic accuracy, standardizing evaluations, and addressing limitations of traditional methods such as VFSS and FEES. AI technologies have shown high performance in tasks like bolus movement detection, esophageal motility analysis, and the interpretation of biomechanical signals, contributing to more objective and efficient clinical workflows. However, challenges such as limited generalizability, the need for standardized protocols, and variability in clinical settings remain significant barriers to widespread adoption. The study underscores the importance of further research to validate these technologies across diverse populations and clinical environments. Addressing these gaps is essential to ensuring the ethical and effective integration of AI into routine clinical practice, ultimately enhancing the quality of care for patients with dysphagia.
- 1. Baijens LW, Clave P, Cras P, Ekberg O, Forster A, Kolb GF, et al. European Society for Swallowing Disorders - European Union Geriatric Medicine Society white paper: oropharyngeal dysphagia as a geriatric syndrome. Clin Interv Aging. 2016;11:1403-1428. doi:10.2147/CIA.S107750. PMID: 27785002. PMCID: PMC5063605.
- 2. Zhao H, Jiang Y, Wang S, He F, Ren F, Zhang Z, et al. Dysphagia diagnosis system with integrated speech analysis from throat vibration. Expert Syst Appl. 2022;204:117496. doi:10.1016/j.eswa.2022.117496.
- 3. Cuadros-Acosta J, Orozco-Duque A. Automatic detection of poor quality signals as a pre-processing scheme in the analysis of sEMG in swallowing. Biomed Signal Process Control. 2022;71:103122. doi:10.1016/j.bspc.2021.103122.
- 4. Martin-Harris B, Canon CL, Bonilha HS, Murray J, Davidson K, Lefton-Greif MA. Best practices in modified barium swallow studies. Am J Speech Lang Pathol. 2020;29(2S):1078-1093. doi:10.1044/2020_AJSLP-19-00189. PMID: 32650657. PMCID: PMC7844340.
- 5. Sejdić E, Khalifa Y, Mahoney AS, Coyle JL. Artificial intelligence and dysphagia: novel solutions to old problems. Arq Gastroenterol. 2020;57(4):343-346. doi:10.1590/s0004-2803.202000000-66. PMID: 33331470.
- 6. Tran KA, Kondrashova O, Bradley A, Williams ED, Pearson JV, Waddell N. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med. 2021;13(1):152. doi:10.1186/s13073-021-00968-x. PMID: 34579788. PMCID: PMC8477474.
- 7. Shimizu H, Nakayama KI. Artificial intelligence in oncology. Cancer Sci. 2020;111(5):1452-1460. doi:10.1111/cas.14377. PMID: 32133724. PMCID: PMC7226189.
- 8. Lee SJ. Application of artificial intelligence in the area of dysphagia. J Rehabil Med. 2019511219
