Engineering flexible machine learning systems by traversing functionally-invariant paths
Guruprasad Raghavan, Bahey Tharwat, Surya Narayanan Hari, Dhruvil Satani, Matt Thomson

TL;DR
This paper introduces a differential geometry framework called functionally invariant paths (FIP) that enables flexible, knowledge-preserving adaptation of neural networks across various tasks and models, including language and vision transformers.
Contribution
The paper proposes a novel geometric approach to neural network adaptation using geodesic paths in weight space, improving continual learning and sparsification without knowledge loss.
Findings
FIP achieves state-of-the-art performance on language and vision models.
The method effectively preserves prior knowledge during adaptation.
FIP requires modest computational resources.
Abstract
Transformers have emerged as the state-of-the-art neural network architecture for natural language processing and computer vision. In the foundation model paradigm, large transformer models (BERT, GPT3/4, Bloom, ViT) are pre-trained on self-supervised tasks such as word or image masking, and then adapted through fine-tuning for downstream user applications including instruction following and question answering. While many approaches have been developed for model fine-tuning, including low-rank weight update strategies (e.g., LoRA), the underlying mathematical principles that enable network adaptation without knowledge loss remain poorly understood. Here, we introduce a differential geometry framework, functionally invariant paths (FIP), that provides flexible and continuous adaptation of neural networks for a range of machine learning goals and network sparsification objectives. We…
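The idea of moving through weight space without changing what the network computes can be illustrated on a toy problem. The sketch below is not the authors' algorithm. It assumes a linear model with outputs `X @ w`, takes the squared weight norm as a stand-in secondary objective, and projects each gradient step onto the null space of the output Jacobian so the outputs stay fixed. The names `null_space_projector`, `traverse_invariant_path`, and `secondary_grad` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def null_space_projector(jacobian):
    """Return the matrix that projects weight updates onto the null space of `jacobian`."""
    # pinv(J) @ J projects onto the row space of J; subtracting it from the
    # identity keeps only directions that leave J @ w unchanged.
    d = jacobian.shape[1]
    return np.eye(d) - np.linalg.pinv(jacobian) @ jacobian

def traverse_invariant_path(X, w0, secondary_grad, step=0.05, n_steps=200):
    """Follow a path in weight space that keeps X @ w fixed (toy, linear case only)."""
    w = w0.copy()
    # For the linear model X @ w, the Jacobian of the outputs w.r.t. w is X itself.
    P = null_space_projector(X)
    path = [w.copy()]
    for _ in range(n_steps):
        # Descend the secondary objective, restricted to output-preserving directions.
        w = w - step * P @ secondary_grad(w)
        path.append(w.copy())
    return np.array(path)

# Assumed toy setup: 5 inputs, 20 weights, so many output-preserving directions exist.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
w0 = rng.normal(size=20)

# Secondary objective (an assumption for illustration): ||w||^2, with gradient 2w.
path = traverse_invariant_path(X, w0, secondary_grad=lambda w: 2.0 * w)

output_drift = np.max(np.abs(X @ path[-1] - X @ w0))
norm_start = np.linalg.norm(path[0])
norm_end = np.linalg.norm(path[-1])
```

In this linear setting the outputs are preserved up to floating-point error while the weight norm shrinks along the path. For a nonlinear network the Jacobian changes with the weights, so it would have to be recomputed at each step and invariance would hold only approximately.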
Taxonomy
Topics: Advanced Neural Network Applications · Computational Physics and Python Applications · Medical Imaging and Analysis
