BDIViz: An Interactive Visualization System for Biomedical Schema Matching with LLM-Powered Validation
Eden Wu, Dishita G Turakhia, Guande Wu, Christos Koutras, Sarah Keegan, Wenke Liu, Beata Szeitz, David Fenyo, Cláudio T. Silva, and Juliana Freire

TL;DR
BDIViz is an interactive visualization system that enhances biomedical schema matching accuracy and efficiency by integrating multiple matching methods with LLM-based validation and user-friendly visualizations.
Contribution
The paper introduces BDIViz, a novel visual analytics system that combines ensemble schema matching with LLM validation to improve biomedical data harmonization.
Findings
BDIViz significantly improves matching accuracy.
It reduces cognitive load and curation time.
It is effective in biomedical case studies.
Abstract
Biomedical data harmonization is essential for enabling exploratory analyses and meta-studies, but the process of schema matching, identifying semantic correspondences between elements of disparate datasets (schemas), remains a labor-intensive and error-prone task. Even state-of-the-art automated methods often yield low accuracy when applied to biomedical schemas due to the large number of attributes and nuanced semantic differences between them. We present BDIViz, a novel visual analytics system designed to streamline the schema matching process for biomedical data. Through formative studies with domain experts, we identified key requirements for an effective solution and developed interactive visualization techniques that address both scalability challenges and semantic ambiguity. BDIViz employs an ensemble approach that combines multiple matching methods with LLM-based validation,…
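
The abstract describes an ensemble that combines multiple matching methods with a validation step, but it does not say which matchers BDIViz uses or how their scores are combined. The following is only an illustrative sketch of the general idea, not the authors' implementation: two simple name-based matchers (character similarity and token overlap) are averaged to rank target attributes for each source attribute, and a caller-supplied `validator` callback stands in for the LLM-based check. All function names, the choice of matchers, the unweighted average, and the example column names are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Character-level similarity of two attribute names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a, b):
    """Jaccard overlap of the alphanumeric tokens in two attribute names."""
    tokens_a = set(re.split(r"[^a-z0-9]+", a.lower())) - {""}
    tokens_b = set(re.split(r"[^a-z0-9]+", b.lower())) - {""}
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical ensemble: any function (str, str) -> float in [0, 1] can be added.
MATCHERS = [name_similarity, token_jaccard]


def ensemble_match(source_cols, target_cols, top_k=2):
    """Rank target attributes for each source attribute by mean matcher score."""
    candidates = {}
    for source in source_cols:
        scored = [
            (target, sum(m(source, target) for m in MATCHERS) / len(MATCHERS))
            for target in target_cols
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        candidates[source] = scored[:top_k]
    return candidates


def validate(candidates, validator):
    """Keep only candidates accepted by `validator(source, target) -> bool`.

    In a real system the validator could wrap an LLM call; here it is a
    plain callback so the sketch runs without any external service.
    """
    return {
        source: [(t, score) for t, score in ranked if validator(source, t)]
        for source, ranked in candidates.items()
    }


if __name__ == "__main__":
    ranked = ensemble_match(["tumor_stage"], ["age", "Tumor Stage"], top_k=1)
    # Stand-in validator that accepts every candidate.
    print(validate(ranked, lambda source, target: True))
```

Keeping the validator as a separate pass over ranked candidates means the cheap matchers narrow the search space first, so a more expensive check would only see a handful of pairs per attribute.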
