Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement
Long Bai, Boyi Ma, Ruohan Wang, Guankun Wang, Beilei Cui, Zhongliang Jiang, Mobarakol Islam, Zhe Min, Jiewen Lai, Nassir Navab, Hongliang Ren

TL;DR
This paper introduces a multimodal graph neural network with adversarial feature disentanglement for robust surgical workflow recognition, effectively handling data corruption and domain shifts by integrating vision and kinematic data.
Contribution
The paper proposes a novel multimodal graph network with adversarial training and contextual decoding to improve robustness in surgical workflow recognition under adverse conditions.
Findings
Enhanced accuracy in challenging scenarios
Robustness against data corruption during storage and transmission
Superior performance demonstrated through extensive experiments
Abstract
Surgical workflow recognition is vital for automating tasks, supporting decision-making, and training novice surgeons, ultimately improving patient safety and standardizing procedures. However, data corruption can degrade performance, whether it stems from occlusion by bleeding or smoke in surgical scenes or from problems with data storage and transmission. In this work, we explore a robust graph-based multimodal approach that integrates vision and kinematic data to enhance accuracy and reliability. Vision data captures dynamic surgical scenes, while kinematic data provides precise movement information, overcoming limitations of visual recognition under adverse conditions. We propose a multimodal Graph Representation network with Adversarial feature Disentanglement (GRAD) for robust surgical workflow recognition in challenging scenarios with domain shifts or corrupted data.…
Taxonomy
Topics: Medical Imaging and Analysis · Artificial Intelligence in Healthcare and Education · Advanced X-ray and CT Imaging
Methods: ALIGN
