Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval
Hanqi Zhang, Chong Chen, Lang Mei, Qi Liu, Jiaxin Mao

TL;DR
The paper introduces Mamba Retriever, a non-Transformer dense retrieval model that offers comparable effectiveness to Transformer models while significantly improving inference speed, especially for long-text retrieval tasks.
Contribution
It demonstrates that the Mamba architecture can serve as an effective and efficient encoder for dense retrieval, and that it outperforms Transformer-based models in inference speed and in handling longer texts.
Findings
Mamba Retriever achieves comparable or better effectiveness than Transformer models.
It extends to longer texts while maintaining effectiveness after fine-tuning.
It has superior inference speed for long-text retrieval.
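The speed finding follows from how the two architectures scale with sequence length: self-attention compares every token with every other token, while a Mamba-style model makes one state update per token. The sketch below only counts those abstract operations. The function names are hypothetical, and constant factors, hidden dimensions, and hardware effects are ignored, so it illustrates the growth rate rather than measured latency.

```python
def attention_cost(seq_len: int) -> int:
    """Pairwise token interactions in self-attention: grows as n^2."""
    return seq_len * seq_len


def linear_scan_cost(seq_len: int) -> int:
    """One recurrent state update per token: grows as n."""
    return seq_len


# Assumed example lengths; the gap widens as passages get longer.
for n in (512, 2048, 8192):
    ratio = attention_cost(n) // linear_scan_cost(n)
    print(f"n={n:5d}  quadratic/linear operation ratio = {ratio}")
```

Under this simplified count, doubling the text length doubles the linear cost but quadruples the quadratic one, which is why the advantage shows up mainly in long-text retrieval.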
Abstract
In the information retrieval (IR) area, dense retrieval (DR) models use deep learning techniques to encode queries and passages into embedding space to compute their semantic relations. It is important for DR models to balance both efficiency and effectiveness. Pre-trained language models (PLMs), especially Transformer-based PLMs, have been proven to be effective encoders of DR models. However, the self-attention component in Transformer-based PLM results in a computational complexity that grows quadratically with sequence length, and thus exhibits a slow inference speed for long-text retrieval. Some recently proposed non-Transformer PLMs, especially the Mamba architecture PLMs, have demonstrated not only comparable effectiveness to Transformer-based PLMs on generative language tasks but also better efficiency due to linear time scaling in sequence length. This paper implements the…
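The core dense retrieval step described in the abstract, encoding queries and passages into one embedding space and comparing them there, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the vectors are hand-written stand-ins for encoder outputs, cosine similarity is an assumed scoring function, and `rank_passages` is a hypothetical helper name.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_id, score) pairs sorted from most to least similar."""
    scored = [(pid, cosine(query_vec, vec)) for pid, vec in passage_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed values); a real system would obtain these
# from the query and passage encoder.
query_vec = [0.9, 0.1, 0.0]
passage_vecs = {
    "p1": [0.8, 0.2, 0.1],
    "p2": [0.0, 0.1, 0.9],
    "p3": [0.5, 0.5, 0.0],
}

ranking = rank_passages(query_vec, passage_vecs)
for pid, score in ranking:
    print(pid, round(score, 3))
```

Because passage embeddings do not depend on the query, they can be computed once offline, so encoder speed matters most when the passages being indexed are long.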
Taxonomy
Topics: Handwritten Text Recognition Techniques
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
