Toward Automated Regulatory Decision-Making: Trustworthy Medical Device Risk Classification with Multimodal Transformers and Self-Training

Yu Han, Aaron Ceross, and Jeroen H.M. Bergmann

arXiv:2505.00422·cs.LG·May 2, 2025

TL;DR

This paper introduces a multimodal Transformer framework with self-training for accurate and trustworthy medical device risk classification. It combines textual and visual data and outperforms text-only and image-only baselines.

Contribution

It presents a novel multimodal Transformer model with cross-attention and self-training strategies for improved medical device risk classification under limited supervision.
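
To make the cross-attention idea concrete, the following is a minimal NumPy sketch of single-head scaled dot-product cross-attention in which text tokens query image patches. It is an illustration only and not the authors' implementation: the direction of attention (text attending to image), the dimensions, the random projection matrices, and all names such as `cross_attention`, `text_tokens`, and `image_patches` are assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(text_tokens, image_patches, w_q, w_k, w_v):
    """Single-head cross-attention: queries from text, keys/values from image.

    All shapes and projections here are hypothetical, chosen for illustration.
    """
    q = text_tokens @ w_q      # (n_text, d)
    k = image_patches @ w_k    # (n_img, d)
    v = image_patches @ w_v    # (n_img, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n_text, n_img)
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
d_text, d_img, d = 16, 32, 8                   # assumed embedding sizes
text_tokens = rng.normal(size=(5, d_text))     # 5 synthetic text tokens
image_patches = rng.normal(size=(9, d_img))    # 9 synthetic image patches
w_q = rng.normal(size=(d_text, d))
w_k = rng.normal(size=(d_img, d))
w_v = rng.normal(size=(d_img, d))

fused, weights = cross_attention(text_tokens, image_patches, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` shows how strongly one text token draws on each image patch, which is one way a model can capture dependencies between the two modalities.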

Findings

01. Achieved up to 90.4% accuracy and 97.9% AUROC on real-world data.

02. Self-training improved accuracy by 3.3 percentage points over standard multimodal fusion.

03. Ablation studies confirmed the benefits of cross-modal attention and self-training.

Abstract

Accurate classification of medical device risk levels is essential for regulatory oversight and clinical safety. We present a Transformer-based multimodal framework that integrates textual descriptions and visual information to predict device regulatory classification. The model incorporates a cross-attention mechanism to capture intermodal dependencies and employs a self-training strategy for improved generalization under limited supervision. Experiments on a real-world regulatory dataset demonstrate that our approach achieves up to 90.4% accuracy and 97.9% AUROC, significantly outperforming text-only (77.2%) and image-only (54.8%) baselines. Compared to standard multimodal fusion, the self-training mechanism improved SVM performance by 3.3 percentage points in accuracy (from 87.1% to 90.4%) and 1.4 points in macro-F1, suggesting that pseudo-labeling can effectively enhance…
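
The self-training step mentioned above can be sketched as a pseudo-labeling loop around an SVM. This is a hedged illustration using scikit-learn on synthetic 2-D data, not the paper's pipeline: the confidence threshold of 0.9, the number of rounds, the RBF kernel, the three synthetic classes, and the helper name `self_train` are all assumptions, and the real system would operate on fused multimodal features rather than toy blobs.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=5):
    """Iteratively add confidently pseudo-labeled samples to the training set.

    threshold and max_rounds are illustrative values, not taken from the paper.
    """
    X_train, y_train = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X_train, y_train)
    for _ in range(max_rounds):
        if len(pool) == 0:
            break
        proba = clf.predict_proba(pool)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # nothing is confident enough, so stop early
        # Map the arg-max column index back to the actual class label.
        pseudo = clf.classes_[proba.argmax(axis=1)[confident]]
        X_train = np.vstack([X_train, pool[confident]])
        y_train = np.concatenate([y_train, pseudo])
        pool = pool[~confident]
        clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X_train, y_train)
    return clf, len(y_train)


# Synthetic stand-in data: three well-separated clusters.
X, y = make_blobs(
    n_samples=300,
    centers=[[-5, 0], [0, 5], [5, 0]],
    cluster_std=1.0,
    random_state=0,
)
# Keep labels for 10 samples per class; treat the rest as unlabeled.
lab_idx = np.concatenate([np.where(y == c)[0][:10] for c in range(3)])
unlab_mask = np.ones(len(y), dtype=bool)
unlab_mask[lab_idx] = False

clf, n_train = self_train(X[lab_idx], y[lab_idx], X[unlab_mask])
print("training set size after self-training:", n_train)
print("accuracy on all samples:", clf.score(X, y))
```

The threshold controls the trade-off between adding more pseudo-labeled samples and admitting label noise; a lower value grows the training set faster but risks reinforcing early mistakes.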

Taxonomy

Methods: Softmax · Attention Is All You Need · Support Vector Machine