Bidirectional Contrastive Split Learning for Visual Question Answering

Yuwei Sun; Hideya Ochiai

arXiv:2208.11435 · cs.CV · December 12, 2023 · 1 citation

TL;DR

This paper introduces BiCSL, a privacy-preserving split learning framework for visual question answering that enhances robustness against adversarial attacks in decentralized multi-modal data settings.

Contribution

The paper proposes Bidirectional Contrastive Split Learning (BiCSL), a novel decentralized multi-modal learning method that improves privacy and robustness for VQA tasks.

Findings

01

BiCSL outperforms centralized methods in robustness against backdoor attacks.

02

The contrastive loss enables more efficient self-supervised learning of the decentralized modules.

03

Demonstrated on five state-of-the-art VQA models using the VQA-v2 dataset.

Abstract

Visual Question Answering (VQA) based on multi-modal data facilitates real-life applications such as home robots and medical diagnoses. One significant challenge is to devise a robust decentralized learning framework for various client models where centralized data collection is avoided due to confidentiality concerns. This work aims to tackle privacy-preserving VQA by decoupling a multi-modal model into representation modules and a contrastive module, and by leveraging inter-module gradient sharing and inter-client weight sharing. To this end, we propose Bidirectional Contrastive Split Learning (BiCSL) to train a global multi-modal model on the entire data distribution of decentralized clients. We employ a contrastive loss that enables more efficient self-supervised learning of decentralized modules. Comprehensive experiments are conducted on the VQA-v2 dataset based on five SOTA…
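
The abstract names a contrastive loss but does not give its form. As a rough illustration only, the sketch below shows a symmetric InfoNCE-style loss between two batches of embeddings, which is one common way such a loss is written. The function name `symmetric_contrastive_loss`, the `temperature` value, and the symmetric averaging are assumptions made for this example, not details taken from the paper.

```python
import math


def _normalize(vec):
    # Scale a vector to unit length; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _diagonal_cross_entropy(logits):
    # Mean cross-entropy where row i should put its mass on column i,
    # i.e. each sample's positive is its own pair.
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.1):
    # Hypothetical helper: InfoNCE-style loss averaged over both matching
    # directions (image-to-text and text-to-image). Assumed form, not the
    # paper's confirmed objective.
    a = [_normalize(v) for v in image_embs]
    b = [_normalize(v) for v in text_embs]
    # Cosine similarity of every image with every text, scaled by temperature.
    logits = [
        [sum(x * y for x, y in zip(u, w)) / temperature for w in b]
        for u in a
    ]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (
        _diagonal_cross_entropy(logits) + _diagonal_cross_entropy(logits_t)
    )


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", symmetric_contrastive_loss(images, matched))
    print("swapped pairs:", symmetric_contrastive_loss(images, swapped))
```

In this toy setup the loss is small when each image embedding lines up with its own question embedding and large when the pairs are swapped. That is the behaviour a contrastive module would rely on to align the outputs of separate representation modules.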

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Head and Neck Surgical Oncology