Trustformer: A Trusted Federated Transformer

Ali Abbasi Tadi; Dima Alhadidi; Luis Rueda

arXiv:2501.11706 · cs.LG · January 22, 2025

TL;DR

This paper presents a privacy-preserving federated learning method for Transformers that reduces communication costs by transmitting layer centroids instead of full weights, maintaining high utility and enhancing security.

Contribution

It introduces a novel federated learning approach that uses clustering to simulate global Transformer models locally, reducing communication overhead and improving privacy.
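The idea of sending per-layer centroids rather than full weights can be illustrated with a small sketch. This summary does not give the paper's clustering procedure, so everything below is an assumption made for illustration: scalar k-means (Lloyd's algorithm) over each layer's weights, the helper names `kmeans_1d`, `summarize_layers`, and `payload_size`, the layer names, and the choice of `k=8`. It only shows how the number of transmitted values shrinks; it is not the authors' method.

```python
import random

def kmeans_1d(values, k, iters=20, seed=0):
    """Plain Lloyd's k-means on scalars; returns k sorted centroids.
    Hypothetical helper: the paper's actual clustering may differ."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # start from k of the observed weights
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            buckets[nearest].append(v)
        # Recompute each centroid as its cluster mean; keep the old one if empty.
        centroids = [sum(b) / len(b) if b else c
                     for b, c in zip(buckets, centroids)]
    return sorted(centroids)

def summarize_layers(layers, k):
    """Map each layer name to k centroids instead of its full weight list."""
    return {name: kmeans_1d(weights, k) for name, weights in layers.items()}

def payload_size(payload):
    """Count the scalar values a client would have to transmit."""
    return sum(len(values) for values in payload.values())

# Toy "model": two layers with made-up sizes and random weights (assumption).
rng = random.Random(42)
layers = {
    "attention.query": [rng.gauss(0.0, 0.02) for _ in range(512)],
    "ffn.dense": [rng.gauss(0.0, 0.02) for _ in range(1024)],
}

summary = summarize_layers(layers, k=8)
full_count = payload_size(layers)    # values sent when sharing full weights
sent_count = payload_size(summary)   # values sent when sharing centroids only
print(full_count, sent_count)        # prints: 1536 16
```

In this toy setting the client uploads 16 values instead of 1536. How a receiver would rebuild a usable model from the centroids is the part the paper addresses and is not reproduced here.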

Findings

01. Achieves comparable utility to state-of-the-art methods.

02. Significantly reduces communication costs.

03. Enhances security with Intel SGX.

Abstract

Transformers, a cornerstone of deep-learning architectures for sequential data, have achieved state-of-the-art results in tasks like Natural Language Processing (NLP). Models such as BERT and GPT-3 exemplify their success and have driven the rise of large language models (LLMs). However, a critical challenge persists: safeguarding the privacy of data used in LLM training. Privacy-preserving techniques like Federated Learning (FL) offer potential solutions, but practical limitations hinder their effectiveness for Transformer training. Two primary issues are (I) the risk of sensitive information leakage due to aggregation methods like FedAvg or FedSGD, and (II) the high communication overhead caused by the large size of Transformer models. This paper introduces a novel FL method that reduces communication overhead while maintaining competitive utility. Our approach avoids sharing full…
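For context on the baseline the abstract criticizes, the standard FedAvg aggregation step can be sketched as below: the server forms a weighted average of client parameter vectors, weighted by each client's number of training samples. The function name `fedavg`, the three toy clients, and their sample counts are illustrative assumptions, not taken from the paper. The point of the sketch is that every client uploads its full parameter vector each round, which is the source of both the communication cost and the exposure of raw weights.

```python
def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors (standard FedAvg step).
    client_weights: list of equal-length lists of floats, one per client.
    client_sizes: number of local training samples per client."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Toy example: three clients, three parameters each (made-up numbers).
clients = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0], [4.0, 0.0, 2.0]]
sizes = [10, 30, 60]

global_weights = fedavg(clients, sizes)
# Every client sends all of its parameters in each round.
uploaded_values = sum(len(w) for w in clients)
print(global_weights, uploaded_values)
```

With a Transformer, each vector would hold millions of parameters or more, so the per-round upload grows with model size times the number of clients.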

Taxonomy

Topics: Physical Unclonable Functions (PUFs) and Hardware Security · Low-power high-performance VLSI design · Smart Grid Security and Resilience

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Absolute Position Encodings · WordPiece · Linear Layer · Weight Decay · Multi-Head Attention