Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery
Long Bai, Mobarakol Islam, Hongliang Ren

TL;DR
This paper introduces a continual learning framework for surgical visual-question localized-answering (VQLA) that balances learning new tasks against retaining old knowledge. It targets catastrophic forgetting in settings where privacy and licensing constraints make old data hard to access.
Contribution
It proposes rigidity-plasticity-aware distillation and self-calibrated heterogeneous distillation techniques, along with weight aligning, to improve continual learning in surgical VQLA systems.
Findings
Outperforms conventional CL methods in surgical VQLA tasks
Effectively mitigates catastrophic forgetting in continual learning
Demonstrates robustness across multiple surgical datasets
Abstract
The visual-question localized-answering (VQLA) system can serve as a knowledgeable assistant in surgical education. Besides providing text-based answers, the VQLA system can highlight the region of interest for better surgical scene understanding. However, deep neural networks (DNNs) suffer from catastrophic forgetting when learning new knowledge. Specifically, when DNNs learn incremental classes or tasks, their performance on old tasks drops dramatically. Furthermore, due to medical data privacy and licensing issues, it is often difficult to access old data when updating continual learning (CL) models. Therefore, we develop a non-exemplar continual surgical VQLA framework to explore and balance the rigidity-plasticity trade-off of DNNs in a sequential learning paradigm. We revisit the distillation loss in CL tasks, and propose rigidity-plasticity-aware distillation (RP-Dist) and…
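For readers unfamiliar with the distillation loss that the abstract revisits, the following is a minimal sketch of the generic temperature-softened form commonly used in non-exemplar continual learning: the frozen old model's outputs on new data act as soft targets for the updated model. It is not the paper's RP-Dist or its self-calibrated heterogeneous distillation; the function names, the temperature value, and the example logits are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(old || new) on softened outputs, scaled by temperature squared.

    old_logits: outputs of the frozen model trained on earlier tasks.
    new_logits: outputs of the model being updated, for the same input.
    Hypothetical helper for illustration, not the paper's formulation.
    """
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(p_old, p_new) if p > 0)
    return kl * temperature ** 2


# Illustrative logits only (not taken from the paper).
old = [2.0, 0.5, -1.0]
unchanged = [2.0, 0.5, -1.0]
drifted = [-1.0, 0.5, 2.0]

print(distillation_loss(old, unchanged))  # no drift from the old model
print(distillation_loss(old, drifted))    # drift is penalized
```

In a sketch like this, the loss is typically added to the new-task loss with a weighting factor. A larger weight favors rigidity (retaining old behavior) and a smaller one favors plasticity (fitting new data), which is the trade-off the abstract refers to.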
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
