Multi-Task Distributed Learning using Vision Transformer with Random Patch Permutation
Sangjoon Park, Jong Chul Ye

TL;DR
This paper introduces a novel multi-task distributed learning method using Vision Transformer with random patch permutation, improving collaboration, privacy, and efficiency in medical imaging applications.
Contribution
It replaces the CNN-based head used in FeSTA with a simple patch embedder whose patches are randomly permuted, enhancing multi-task learning and privacy without increasing communication overhead.
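As a rough illustration of the idea of a simple patch embedder with random patch permutation, the sketch below splits an image into patches, applies one linear projection, and shuffles the resulting tokens. It is not the authors' implementation: the function names (`patchify`, `embed_and_permute`), the patch size, the embedding width, and the random projection weights are all assumptions made for the example.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W) image into flattened, non-overlapping patches."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    x = image[:rows * patch, :cols * patch]
    # (rows, patch, cols, patch) -> (rows, cols, patch, patch)
    x = x.reshape(rows, patch, cols, patch).transpose(0, 2, 1, 3)
    return x.reshape(rows * cols, patch * patch)

def embed_and_permute(image, weight, rng, patch=4):
    """Hypothetical client-side head: linear patch embedding, then a shuffle."""
    tokens = patchify(image, patch) @ weight   # (num_patches, dim)
    perm = rng.permutation(tokens.shape[0])    # random order, kept on the client
    return tokens[perm], perm

rng = np.random.default_rng(0)
image = rng.random((16, 16))                   # stand-in for a medical image
weight = rng.standard_normal((16, 8))          # assumed 4x4 patches -> 8-dim tokens

shuffled, perm = embed_and_permute(image, weight, rng)
original = patchify(image, 4) @ weight

# Only the holder of `perm` can restore the spatial order of the tokens.
restored = shuffled[np.argsort(perm)]
```

An order-insensitive operation such as mean pooling over tokens gives the same result for `shuffled` and `original`. This suggests why a model that treats its input as a set of patches could still learn from permuted features, while a receiver who does not know `perm` sees the patches out of spatial order.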
Findings
Significant improvement in multi-task collaboration performance
Enhanced communication efficiency in distributed learning
Better privacy preservation in medical imaging tasks
Abstract
The widespread application of artificial intelligence in health research is currently hampered by limitations in data availability. Distributed learning methods such as federated learning (FL) and split learning (SL) have been introduced to solve this problem, as well as data management and ownership issues, each with different strengths and weaknesses. The recent proposal of federated split task-agnostic (FeSTA) learning tries to reconcile the distinct merits of FL and SL by enabling multi-task collaboration between participants through the Vision Transformer (ViT) architecture, but it suffers from higher communication overhead. To address this, here we present p-FeSTA, a multi-task distributed learning method using ViT with random patch permutation. Instead of using a CNN-based head as in FeSTA, p-FeSTA adopts a simple patch embedder with random permutation, improving the multi-task learning performance…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cerebrospinal fluid and hydrocephalus · Stochastic Gradient Optimization Techniques
Methods: Attention Is All You Need · Linear Layer · Dropout · Absolute Position Encodings · Label Smoothing · Softmax · Adam · Residual Connection · Byte Pair Encoding · Position-Wise Feed-Forward Layer
