GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR
Jiaying Zhang, Lei Shi, Jiguo Li, Jun Xu, Jiuchong Gao, Jinghua Hao, Renqing He

TL;DR
GeoRA is a geometry-aware low-rank adaptation method for reinforcement learning with verifiable rewards (RLVR). It preserves pre-trained structure while improving efficiency and performance on models from 1.5B to 32B parameters.
Contribution
GeoRA exploits the anisotropic structure of RLVR updates, using SVD to initialize low-rank adapters and maintain residual components as structural anchors.
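To make the idea of an SVD-initialized adapter with a frozen residual concrete, here is a minimal NumPy sketch. It is an illustration under assumptions, not the authors' implementation: the summary does not say which matrix GeoRA decomposes or how it selects directions, so this sketch simply splits a weight matrix into its top-`rank` singular components (trainable factors `b` and `a`) and the remainder (a frozen `residual`). The function name `svd_split` and the matrix sizes are hypothetical.

```python
import numpy as np


def svd_split(weight, rank):
    """Split `weight` into low-rank factors (b, a) and a frozen residual.

    Assumption: the adapter is taken from the top-`rank` singular directions
    of the weight itself. GeoRA's actual choice of subspace may differ.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    b = u[:, :rank] * sqrt_s            # shape (out_features, rank)
    a = sqrt_s[:, None] * vt[:rank]     # shape (rank, in_features)
    residual = weight - b @ a           # kept frozen during training
    return b, a, residual


# Hypothetical 64x32 layer with a rank-4 adapter.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32))
b, a, residual = svd_split(w, rank=4)

# At initialization the adapted layer reproduces the original weight.
assert np.allclose(b @ a + residual, w)
```

Because `b @ a + residual` equals the original weight at initialization, training starts from the pre-trained function, and only the small factors `b` and `a` would receive gradient updates.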
Findings
GeoRA outperforms strong low-rank baselines in RLVR tasks.
GeoRA shows better generalization and less forgetting on out-of-domain tasks.
Experiments on models from 1.5B to 32B parameters demonstrate consistent improvements.
Abstract
Reinforcement Learning with Verifiable Rewards (RLVR) is a key paradigm for improving large-scale reasoning models. Unlike supervised fine-tuning (SFT), RLVR exhibits distinct optimization dynamics and is sensitive to the preservation of pre-trained geometric structures. However, existing parameter-efficient methods face key limitations in this regime. Low-rank adaptation methods, such as PiSSA, are primarily designed for SFT and do not account for these dynamics and structures. Conversely, directly fine-tuning the unstructured sparse parameter subspace favored by RLVR encounters efficiency bottlenecks on modern hardware. To address these challenges, we propose GeoRA (Geometry-Aware Low-Rank Adaptation), a low-rank adaptation method tailored for RLVR. Specifically, GeoRA exploits the anisotropic and compressible structure of…
