Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery

Long Bai; Mobarakol Islam; Hongliang Ren

arXiv:2307.12045·cs.CV·July 25, 2023

TL;DR

This paper introduces a non-exemplar continual learning framework for surgical visual-question localized-answering (VQLA) that balances learning new tasks against retaining old knowledge. It targets catastrophic forgetting in settings where old data is hard to access because of medical data privacy and licensing constraints.

Contribution

It proposes rigidity-plasticity-aware distillation and self-calibrated heterogeneous distillation techniques, along with weight aligning, to improve continual learning in surgical VQLA systems.
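
For readers unfamiliar with distillation in continual learning, the following is a minimal sketch of a generic temperature-softened distillation loss, in which the frozen old model's outputs act as soft targets for the updated model. It is not the paper's RP-Dist or self-calibrated heterogeneous distillation, whose details are not given on this page. The function names, the default temperature of 2.0, and the assumption that both logit lists cover the same old classes are illustrative choices only.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(old || new) over softened distributions, scaled by temperature**2.

    old_logits: outputs of the frozen old model (hypothetical name).
    new_logits: outputs of the model being updated, restricted to the
                same old classes (an assumption of this sketch).
    """
    log_p_old = log_softmax(old_logits, temperature)
    log_p_new = log_softmax(new_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p_old, log_p_new))
    return kl * temperature ** 2


if __name__ == "__main__":
    # Identical outputs incur no penalty; drifting outputs are penalized.
    print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))
    print(distillation_loss([2.0, 0.5, -1.0], [0.0, 1.5, 1.0]))
```

In a continual learning update, a term of this kind would typically be added to the loss on the new task, so that the model is pulled toward the old model's behavior on old classes while still fitting new data.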

Findings

01 Outperforms conventional CL methods in surgical VQLA tasks

02 Effectively mitigates catastrophic forgetting in continual learning

03 Demonstrates robustness across multiple surgical datasets

Abstract

The visual-question localized-answering (VQLA) system can serve as a knowledgeable assistant in surgical education. Besides providing text-based answers, the VQLA system can highlight the region of interest for better surgical scene understanding. However, deep neural networks (DNNs) suffer from catastrophic forgetting when learning new knowledge. Specifically, when DNNs learn incremental classes or tasks, their performance on old tasks drops dramatically. Furthermore, due to medical data privacy and licensing issues, it is often difficult to access old data when updating continual learning (CL) models. Therefore, we develop a non-exemplar continual surgical VQLA framework to explore and balance the rigidity-plasticity trade-off of DNNs in a sequential learning paradigm. We revisit the distillation loss in CL tasks, and propose rigidity-plasticity-aware distillation (RP-Dist) and…

Code & Models

Repositories

longbai1006/cs-vqla (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition