Neural Organ Transplantation (NOT): Checkpoint-Based Modular Adaptation for Transformer Models
Ahmad Al-Zuraiqi

TL;DR
Neural Organ Transplantation (NOT) introduces a modular framework for transformer domain adaptation by transplanting independently trained layer subsets, outperforming traditional fine-tuning methods and enabling privacy-preserving transfer.
Contribution
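The paper presents a checkpoint-based modular adaptation method for transformer models, in which independently trained layer subsets are saved as standalone checkpoints and transplanted into compatible recipient models without access to the original training data.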
Findings
Donor transplantation achieves an order-of-magnitude improvement in perplexity over LoRA.
Early insertion positions yield better adaptation results.
The method enables efficient, privacy-preserving domain transfer for decoder-only transformers ranging from 124M to 20B parameters.
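For reference, the perplexity metric cited in these findings is conventionally the exponential of the mean per-token negative log-likelihood. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function name `perplexity` and the input values are illustrative assumptions.

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))

# Illustrative values only (not results from the paper):
# a model that assigns probability 0.25 to every token has perplexity 4,
# up to floating-point rounding.
uniform_nlls = [-math.log(0.25)] * 8
print(perplexity(uniform_nlls))
```

Under this definition, a tenfold reduction in perplexity corresponds to lowering the mean per-token loss by ln 10, or about 2.3 nats.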
Abstract
We introduce Neural Organ Transplantation (NOT), a modular adaptation framework that enables trained transformer layers to function as reusable transferable checkpoints for domain adaptation. Unlike conventional fine-tuning approaches that tightly couple trained parameters to specific model instances and training data, NOT extracts contiguous layer subsets ("donor organs") from pre-trained models, trains them independently on domain-specific data, and saves them as standalone checkpoint files that can be transplanted into compatible recipient models without access to the original training data. Through experiments on three decoder-only transformer architectures spanning 124M to 20B parameters (GPT-2, TinyLlama, and GPT-OSS), we demonstrate that donor transplantation substantially outperforms existing adaptation methods, achieving an order-of-magnitude improvement in perplexity over LoRA…
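The extract, save, and transplant workflow described in the abstract can be sketched roughly as follows. This is a toy illustration that uses only the Python standard library, not the authors' implementation: models are plain lists of layer dictionaries, and every name (`make_model`, `extract_donor`, `save_donor`, `transplant`), the JSON checkpoint layout, and the hidden-size compatibility check are hypothetical assumptions. The domain-specific training step is omitted, and the sketch assumes donor layers are inserted into the recipient rather than replacing existing layers.

```python
import json
import os
import tempfile

def make_model(num_layers, hidden_size, tag):
    """Toy stand-in for a decoder-only transformer: a list of layer dicts."""
    return [{"hidden_size": hidden_size, "weights": f"{tag}-layer-{i}"}
            for i in range(num_layers)]

def extract_donor(model, start, end):
    """Copy the contiguous layer subset model[start:end] (the donor)."""
    if not 0 <= start < end <= len(model):
        raise ValueError("donor range must be a non-empty contiguous slice")
    return [dict(layer) for layer in model[start:end]]

def save_donor(donor, path):
    """Persist the donor layers as a standalone checkpoint file."""
    with open(path, "w") as f:
        json.dump({"hidden_size": donor[0]["hidden_size"], "layers": donor}, f)

def transplant(recipient, path, position):
    """Return a new model with the saved donor inserted at `position`."""
    with open(path) as f:
        ckpt = json.load(f)
    if not 0 <= position <= len(recipient):
        raise ValueError("insertion position out of range")
    # Hypothetical compatibility rule: hidden sizes must match.
    if any(layer["hidden_size"] != ckpt["hidden_size"] for layer in recipient):
        raise ValueError("recipient is not compatible with this donor")
    return recipient[:position] + ckpt["layers"] + recipient[position:]

source = make_model(num_layers=6, hidden_size=8, tag="source")
donor = extract_donor(source, start=2, end=4)
# Domain-specific training of the donor layers would happen here (omitted).
path = os.path.join(tempfile.mkdtemp(), "donor.json")
save_donor(donor, path)

# Only the checkpoint file is needed from here on, not the training data.
recipient = make_model(num_layers=4, hidden_size=8, tag="recipient")
patched = transplant(recipient, path, position=1)  # an early insertion point
print([layer["weights"] for layer in patched])
```

Because `transplant` reads only the checkpoint file, the recipient side never touches the data the donor was trained on, which is the property the abstract highlights.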
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications
