Decoupling Knowledge and Task Subspaces for Composable Parametric Retrieval Augmented Generation
Weihang Su, Hanwen Zhang, Qingyao Ai, Yiqun Liu

TL;DR
This paper introduces Orthogonal Subspace Decomposition (OSD), a method to separate task and document knowledge in adapters, improving the stability and robustness of parametric retrieval-augmented generation when merging multiple document adapters.
Contribution
The paper proposes an adapter-training setup that orthogonalizes task and document knowledge, with the aim of making adapter composition more reliable in parametric retrieval-augmented generation (PRAG) systems.
Findings
Orthogonalization improves adapter merging stability.
OSD enhances robustness across multiple knowledge-intensive tasks.
Experiments show better performance with orthogonal task and document adapters.
Abstract
Parametric Retrieval-Augmented Generation (PRAG) encodes external documents into lightweight parameter modules that can be retrieved and merged at inference time, offering a promising alternative to in-context retrieval augmentation. Despite its potential, many PRAG implementations train document adapters with task-supervised objectives, which may cause each adapter to encode both document-specific facts and reusable task-solving behavior. This entanglement may make adapter composition less reliable: when multiple adapters are merged at inference time, their overlapping task behaviors can accumulate together with document-specific updates, potentially making the merged adapter less stable and less focused on the intended document knowledge. To examine this issue, we explore Orthogonal Subspace Decomposition (OSD), an adapter-training setup that separates reusable task behavior from…
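The accumulation effect described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation; the abstract does not say how OSD is realized. The sketch assumes that each adapter update is flattened into a plain vector, that merging is element-wise summation, and that the task subspace is already known as an orthonormal basis. The names `project_out`, `merge`, and `task_basis`, and all numeric values, are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(update: Vector, task_basis: List[Vector]) -> Vector:
    """Remove the part of `update` that lies in the span of `task_basis`.

    Assumes `task_basis` is orthonormal (an assumption of this sketch).
    """
    result = list(update)
    for b in task_basis:
        coeff = dot(update, b)
        result = [r - coeff * bi for r, bi in zip(result, b)]
    return result


def merge(updates: List[Vector]) -> Vector:
    """Merge adapters by element-wise summation (assumed merging rule)."""
    return [sum(vals) for vals in zip(*updates)]


# Toy 3-dimensional parameter space. Axis 0 stands in for a shared
# "task direction"; axes 1 and 2 carry document-specific content.
task_basis = [[1.0, 0.0, 0.0]]
task_update = [0.5, 0.0, 0.0]

# Entangled adapters: each one carries the same task component (0.5 on axis 0)
# in addition to its own document component.
adapter_a = [0.5, 1.0, 0.0]
adapter_b = [0.5, 0.0, 2.0]

# Merging entangled adapters counts the task component once per document.
entangled = merge([adapter_a, adapter_b])

# Decoupled variant: strip the task component from each document adapter,
# merge the document parts, then add the task update exactly once.
doc_a = project_out(adapter_a, task_basis)
doc_b = project_out(adapter_b, task_basis)
decoupled_docs = merge([doc_a, doc_b])
decoupled = merge([task_update, decoupled_docs])

print("entangled merge:", entangled)
print("decoupled merge:", decoupled)
```

In this toy setting the task component of the entangled merge grows with the number of merged adapters, while the decoupled merge keeps it fixed and leaves the document components unchanged. How the paper actually enforces the separation during training is not described in the text available here.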
