Task Vector Bases: A Unified and Scalable Framework for Compressed Task Arithmetic
Siqi Zeng, Yifei He, Meitong Liu, Weiqiu You, Yifan Hao, Yao-Hung Hubert Tsai, Makoto Yamada, Han Zhao

TL;DR
This paper introduces Task Vector Bases, a framework that compresses multiple task vectors into fewer basis vectors, enabling scalable task arithmetic with theoretical guarantees and improved empirical performance.
Contribution
It proposes a novel basis compression method for task vectors that maintains arithmetic capabilities and enhances scalability in transfer learning.
Findings
Outperforms heuristic basis methods in experiments.
Reduces storage and computation while preserving task arithmetic performance.
Sometimes surpasses full task vector collections in downstream tasks.
Abstract
Task arithmetic, which represents downstream tasks through linear operations on task vectors, has emerged as a simple yet powerful paradigm for transferring knowledge across diverse settings. However, maintaining a large collection of task vectors introduces scalability challenges in both storage and computation. We propose Task Vector Bases, a framework that compresses task vectors into basis vectors while preserving the functionality of task arithmetic. By representing each task vector as a structured linear combination of basis atoms, our approach supports standard operations such as addition and negation, as well as more advanced arithmetic operations. The framework is orthogonal to other efficiency-oriented improvements in task arithmetic and can be used in combination with them. We provide theoretical analysis showing that basis compression retains addition generalization guarantees…
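To make the operations named in the abstract concrete, here is a minimal sketch of task arithmetic on a toy 4-parameter model, written in plain Python. It assumes the usual convention that a task vector is the fine-tuned weights minus the pre-trained weights. The helper names (`task_vector`, `combine`, `edit_model`) and the hand-picked `mixing` matrix are hypothetical and exist only for illustration. In particular, the compression step is not the paper's learned procedure; it only shows what "each task vector as a linear combination of basis atoms" can look like when fewer atoms than tasks are stored.

```python
from typing import Dict, List

Vector = List[float]


def task_vector(finetuned: Vector, pretrained: Vector) -> Vector:
    """Task vector: fine-tuned weights minus pre-trained weights."""
    return [f - p for f, p in zip(finetuned, pretrained)]


def combine(atoms: List[Vector], coeffs: List[float]) -> Vector:
    """Linear combination sum_k coeffs[k] * atoms[k]."""
    dim = len(atoms[0])
    return [sum(c * a[i] for c, a in zip(coeffs, atoms)) for i in range(dim)]


def edit_model(pretrained: Vector, tau: Vector, scale: float = 1.0) -> Vector:
    """Edited weights theta_0 + scale * tau (a negative scale is negation)."""
    return [p + scale * t for p, t in zip(pretrained, tau)]


# Toy pre-trained weights and three fine-tuned checkpoints (made-up numbers).
theta0: Vector = [0.0, 0.0, 0.0, 0.0]
finetuned: Dict[str, Vector] = {
    "A": [1.0, 0.0, 0.0, 0.0],
    "B": [0.0, 1.0, 0.0, 0.0],
    "C": [0.0, 0.0, 2.0, 0.0],
}
taus = [task_vector(w, theta0) for w in finetuned.values()]

# Addition: merge all three tasks into one model.
merged = edit_model(theta0, combine(taus, [1.0, 1.0, 1.0]))

# Negation: subtract task A's vector to suppress that task.
forget_a = edit_model(theta0, taus[0], scale=-1.0)

# Hypothetical compression: 3 task vectors -> 2 basis atoms, each atom a
# non-negative combination of the original task vectors (hand-picked here).
mixing = [
    [0.5, 0.5, 0.0],  # atom 0 averages tasks A and B
    [0.0, 0.0, 1.0],  # atom 1 is task C
]
basis = [combine(taus, row) for row in mixing]

# Addition carried out on the 2 stored atoms instead of the 3 task vectors.
merged_compressed = edit_model(theta0, combine(basis, [2.0, 1.0]))

print(merged, merged_compressed, forget_a)
```

In this toy case the two atoms reproduce the merged model exactly because the chosen coefficients undo the averaging. With real task vectors, compression is generally lossy and the coefficients would have to be learned or fitted.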
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. Clear diagnosis of PCA’s limitations. The geometric argument for why PCA fails to preserve task-arithmetic semantics is insightful and well-motivated, and it naturally motivates the proposed approach.
2. Solid results at higher compression. When compressing a larger collection of task vectors, the method outperforms simple baselines; these observations align with the authors’ theoretical analysis.
3. Covers both addition and subtraction. Evaluating both additive and subtractive task arithmeti
1. Questionable practical motivation for very large T. A major stated goal is reducing storage overhead for many task vectors. Yet in practice, the quality of simple task-vector merging typically degrades rapidly as T grows; much prior work evaluates in the ~8–20 range. If real deployments rarely maintain very large libraries of task vectors due to performance collapse, the storage-savings motivation is less compelling without concrete use cases.
2. Limited baselines. The main experiments compar
- The positioning of the paper in literature is clear, offering a framework that tackles a novel problem.
- The method is compelling, as it is solidly grounded in theory, allowing for using classical results from spectral analysis.
- Considering the constraints of the problem, the experimental validation is extensive (also, it is very useful that the authors compared with simple PCA) and the results are promising.
- The paper is well-written, the figures/plots help the narrative and understandin
Overall, the paper is solid. However, in light of recent advances in Task Arithmetic, the following points require some attention:
**W1.** [1,2,3,4] prove and support the idea that linearization around the pre-trained parameters is the key enabler of proper Task Arithmetic. The intuition is that, when task vectors implement functions that are linear w.r.t. the weights of the model, the composed model will act as a linear combination of orthogonal functions (thus, allowing for minimal interfere
- Practical relevance of the problem formulation: Storing a full set of task vectors for each task in large-scale models is both memory-intensive and computationally expensive during merging. The desire to perform task arithmetic across multiple tasks with a minimal set of vectors reflects a realistic and pressing need in practical deployment settings. The authors' focus on this challenge is timely and well-motivated.
- Limitations of PCA-based compression and proposed remedy: The paper highlig
- Why autoencoders?: While the authors rightly note that PCA-based compression does not yield basis vectors that can be expressed as non-negative linear combinations of task vectors, the paper does not clarify why alternative dimensionality reduction techniques with inherent non-negativity constraints were not explored. For example, Nonnegative Matrix Factorization (NMF) is a well-established method that decomposes a non-negative matrix into a product of non-negative basis and coefficient matric
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Reinforcement Learning in Robotics
