EmbeddingRWKV: State-Centric Retrieval with Reusable States
Haowen Hou, Jie Yang

TL;DR
EmbeddingRWKV introduces a unified, state-centric retrieval approach that leverages reusable states from an RWKV-based language model, significantly improving efficiency and maintaining high retrieval quality in RAG systems.
Contribution
The paper proposes a novel state-centric retrieval paradigm with a unified model, EmbeddingRWKV, enabling efficient retrieval and reranking by reusing compact states and reducing redundant computation.
Findings
Achieves 5.4x to 44.8x speedup in reranking.
Maintains 98.62% of full-model performance with only 25% of layers.
Demonstrates high-quality retrieval and reranking results.
Abstract
Current Retrieval-Augmented Generation (RAG) systems typically employ a traditional two-stage pipeline: an embedding model for initial retrieval followed by a reranker for refinement. However, this paradigm suffers from significant inefficiency due to the lack of shared information between stages, leading to substantial redundant computation. To address this limitation, we propose State-Centric Retrieval, a unified retrieval paradigm that utilizes "states" as a bridge to connect embedding models and rerankers. First, we perform state representation learning by fine-tuning an RWKV-based LLM, transforming it into EmbeddingRWKV, a unified model that serves as both an embedding model and a state backbone for extracting compact, reusable states. Building upon these reusable states, we further design a state-based reranker to fully leverage precomputed information. During…
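The idea of reusing precomputed states across retrieval and reranking can be illustrated with a toy sketch. It is not the authors' implementation: the decayed running sum below is a simplified stand-in for the RWKV recurrence, and `ToyStateEncoder`, `token_vector`, `READOUT`, `retrieve`, and `rerank` are hypothetical names introduced only for illustration. The sketch assumes the reranker can start from a cached document state and read only the query tokens, so each document is encoded once at indexing time.

```python
import math
from typing import Dict, List, Optional

DIM = 8  # toy state size (assumption, chosen for readability)


def token_vector(token: str) -> List[float]:
    """Deterministic toy token embedding; a stand-in for a learned embedding."""
    seed = sum(ord(c) * (i + 1) for i, c in enumerate(token))
    return [math.sin(seed * (k + 1)) for k in range(DIM)]


class ToyStateEncoder:
    """Folds tokens into a fixed-size state with a decayed running sum.

    This is a simplification for illustration, not the RWKV recurrence.
    """

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay
        self.tokens_processed = 0  # counts every token the encoder reads

    def encode(self, text: str, init_state: Optional[List[float]] = None) -> List[float]:
        state = list(init_state) if init_state is not None else [0.0] * DIM
        for tok in text.lower().split():
            vec = token_vector(tok)
            state = [self.decay * s + v for s, v in zip(state, vec)]
            self.tokens_processed += 1
        return state


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_state: List[float], doc_states: Dict[str, List[float]], k: int) -> List[str]:
    """Stage 1: use the cached state directly as the document embedding."""
    ranked = sorted(doc_states, key=lambda d: cosine(query_state, doc_states[d]), reverse=True)
    return ranked[:k]


# Hypothetical, untrained readout weights; a real reranker would learn these.
READOUT = [math.cos(k + 1) for k in range(DIM)]


def rerank(encoder: ToyStateEncoder, query: str,
           doc_states: Dict[str, List[float]], candidates: List[str]) -> List[str]:
    """Stage 2: continue from each cached document state over the query tokens only."""
    scores = {}
    for doc_id in candidates:
        joint = encoder.encode(query, init_state=doc_states[doc_id])
        scores[doc_id] = sum(w * s for w, s in zip(READOUT, joint))
    return sorted(candidates, key=lambda d: scores[d], reverse=True)


docs = {
    "d1": "rwkv is a recurrent language model",
    "d2": "rerankers refine retrieved passages",
    "d3": "states can be cached and reused",
}
query = "reusable recurrent states"

encoder = ToyStateEncoder()
doc_states = {doc_id: encoder.encode(text) for doc_id, text in docs.items()}  # indexing, done once
query_state = encoder.encode(query)
candidates = retrieve(query_state, doc_states, k=2)
reranked = rerank(encoder, query, doc_states, candidates)
print(candidates, reranked, encoder.tokens_processed)
```

In this toy run the encoder reads 16 document tokens at indexing time, 3 query tokens for retrieval, and 3 query tokens per candidate during reranking, for 25 tokens in total. A reranker that re-read each candidate document together with the query would process the document tokens a second time, which is the redundant computation the abstract describes.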
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Multimodal Machine Learning Applications
