MedFusionT5: Cross-Modal Attention Boosts Semantic Quality and Reduces Hallucinations in Dental AI
Hamida Abdaoui, Sabri Barbaria, Ismail Dergaa, Halil İbrahim Ceylan, Nicola Luigi Bragazzi, Andrea de Giorgio, Ridha Ben Salah, Hanene Boussi Rahmouni

TL;DR
MedFusionT5 improves dental AI reports by using cross-modal attention to enhance accuracy and reduce false information.
Contribution
Introduces MedFusionT5, a unidirectional cross-modal alignment framework that reduces hallucinations in dental AI reports.
Findings
MedFusionT5 outperformed baselines with a 122% increase in CIDEr and 320% over concatenation.
Achieved a 2.42% hallucination rate, a 39% reduction compared to co-attention baselines.
Maintained high precision (0.982) and recall (0.923) across all report lengths.
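The headline figures above can be put on a common footing with a little arithmetic. The sketch below is illustrative only: it assumes the percentage gains and the 39% reduction are relative changes, and it derives an F1 score that the summary itself does not report. The function names are hypothetical and not taken from the paper.

```python
def multiplier_from_percent_increase(pct):
    """Convert a relative increase in percent into a ratio (e.g. 122% -> 2.22x)."""
    return 1.0 + pct / 100.0


def implied_baseline_rate(rate, relative_reduction_pct):
    """Back out the baseline rate, assuming the reduction is relative to that baseline."""
    return rate / (1.0 - relative_reduction_pct / 100.0)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Figures quoted in the findings above.
cider_ratio = multiplier_from_percent_increase(122)       # gain reported as 122%
concat_ratio = multiplier_from_percent_increase(320)      # gain over concatenation
baseline_hallucination = implied_baseline_rate(2.42, 39)  # in percent
f1 = f1_score(0.982, 0.923)

print(f"CIDEr ratio: {cider_ratio:.2f}x, vs concatenation: {concat_ratio:.2f}x")
print(f"Implied co-attention hallucination rate: {baseline_hallucination:.2f}%")
print(f"Derived F1: {f1:.3f}")
```

Under these assumptions, the co-attention baseline would sit near a 3.97% hallucination rate, and the reported precision and recall correspond to an F1 of roughly 0.952.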
Abstract
Automated dental report generation faces significant challenges in multimodal fusion, often resulting in suboptimal semantic quality and risks of hallucination, where AI generates clinically unsupported content. Current approaches that rely on simple feature concatenation or bidirectional attention mechanisms fail to effectively capture visual-textual relationships in medical imaging. This study aims to develop MedFusionT5, a unidirectional cross-modal alignment framework that (1) achieves superior clinical report quality through focused attention between visual patches and clinical text representations, and (2) ensures exceptional factual consistency by minimising hallucination rates. We implemented a novel architecture that integrates vision transformer (ViT) for patch-based visual feature extraction with Bio_ClinicalBERT for clinical text encoding. The core innovation is a…
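As a rough illustration of what one-directional cross-modal attention looks like, the sketch below lets text token vectors act as queries over image patch vectors. It is a generic scaled dot-product example, not the paper's module: the choice of text as queries and patches as keys and values is an assumption, the learned projections of a real model are omitted, and the toy vectors stand in for ViT and Bio_ClinicalBERT outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(text_tokens, image_patches):
    """One-directional attention: each text token queries all image patches.

    Assumption: patches serve as both keys and values, with no learned
    projection matrices. Patches never attend back to the text.
    """
    dim = len(image_patches[0])
    scale = math.sqrt(dim)
    fused, weights = [], []
    for query in text_tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in image_patches]
        attn = softmax(scores)
        # Weighted sum of patch vectors gives the visual context for this token.
        context = [sum(w * patch[d] for w, patch in zip(attn, image_patches))
                   for d in range(dim)]
        fused.append(context)
        weights.append(attn)
    return fused, weights


# Toy 2-dimensional embeddings (hypothetical stand-ins for encoder outputs).
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_patches = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

fused, weights = cross_attend(text_tokens, image_patches)
for token_weights in weights:
    print([round(w, 3) for w in token_weights])
```

Each row of `weights` sums to 1 and shows which patches a given text token draws on. A bidirectional (co-attention) variant would add a second pass with the roles of the two modalities swapped.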
Taxonomy
Topics: COVID-19 diagnosis using AI · Radiology practices and education · Misinformation and Its Impacts
