LLMs Meet Isolation Kernel: Lightweight, Learning-free Binary Embeddings for Fast Retrieval
Zhibo Zhang, Yang Xu, Kai Ming Ting, Cam-Tu Nguyen

TL;DR
This paper introduces Isolation Kernel Embedding (IKE), a learning-free method that converts LLM embeddings into binary embeddings for fast text retrieval, cutting retrieval time by up to 16.7x and memory usage by 16x while keeping accuracy comparable to the original embeddings.
Contribution
IKE is a learning-free technique that uses Isolation Kernel (IK) to transform an LLM embedding into a binary embedding, speeding up retrieval and shrinking the memory footprint while keeping accuracy comparable to the original embeddings.
Findings
IKE achieves up to 16.7x faster retrieval than the original LLM embeddings
IKE reduces memory usage by 16x relative to the original LLM embeddings
IKE maintains accuracy comparable to the original LLM embeddings
Abstract
Large language models (LLMs) have recently enabled remarkable progress in text representation. However, their embeddings are typically high-dimensional, leading to substantial storage and retrieval overhead. Although recent approaches such as Matryoshka Representation Learning (MRL) and Contrastive Sparse Representation (CSR) alleviate these issues to some extent, they still suffer from retrieval accuracy degradation. This paper proposes Isolation Kernel Embedding or IKE, a learning-free method that transforms an LLM embedding into a binary embedding using Isolation Kernel (IK). Lightweight and based on binary encoding, IKE offers a low memory footprint and fast bitwise computation, lowering retrieval latency. Experiments on multiple text retrieval datasets demonstrate that IKE offers up to 16.7x faster retrieval and 16x lower memory usage than the original LLM embeddings, while…
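The abstract describes turning a dense embedding into a binary one and comparing codes with bitwise operations, but this summary does not give the construction. The sketch below is one way that idea could look, not the authors' implementation. It assumes a nearest-reference-point (Voronoi-style) Isolation Kernel feature map: each of `t` random partitionings samples `psi` reference points from the data, a vector is assigned to the cell of its nearest reference point, and the cell id is one-hot encoded. The function names, `t`, `psi`, and the random stand-in data are all illustrative assumptions.

```python
import numpy as np

def fit_refs(data, t=32, psi=8, seed=0):
    """Sample t independent sets of psi reference points (assumed IK partitionings)."""
    rng = np.random.default_rng(seed)
    idx = np.stack([rng.choice(len(data), size=psi, replace=False) for _ in range(t)])
    return data[idx]  # shape (t, psi, d)

def cell_ids(x, refs):
    """For each row of x, index of the nearest reference point in every partitioning."""
    d2 = ((x[:, None, None, :] - refs[None, :, :, :]) ** 2).sum(-1)  # (n, t, psi)
    return d2.argmin(-1)  # (n, t)

def to_bits(codes, psi):
    """One-hot encode cell ids (psi bits per partitioning) and pack 8 bits per byte."""
    n, t = codes.shape
    onehot = np.zeros((n, t, psi), dtype=np.uint8)
    np.put_along_axis(onehot, codes[:, :, None], 1, axis=2)
    return np.packbits(onehot.reshape(n, t * psi), axis=1)

def ik_similarity(query_bits, db_bits):
    """Count partitionings where query and item share a cell: popcount(query AND item)."""
    shared = np.unpackbits(query_bits[None, :] & db_bits, axis=1)
    return shared.sum(axis=1, dtype=np.int64)

def search(query_bits, db_bits, k=5):
    """Indices of the k items with the highest shared-cell count."""
    return np.argsort(-ik_similarity(query_bits, db_bits), kind="stable")[:k]

# Usage with random vectors standing in for LLM embeddings (hypothetical sizes).
rng = np.random.default_rng(1)
data = rng.normal(size=(100, 16)).astype(np.float32)
refs = fit_refs(data, t=32, psi=8)
db_bits = to_bits(cell_ids(data, refs), psi=8)   # 32 * 8 bits = 32 bytes per item
sims = ik_similarity(db_bits[7], db_bits)        # item 7 used as its own query
top = search(db_bits[7], db_bits, k=5)
```

No training step is involved: fitting only samples reference points, which matches the "learning-free" description. Because every code has exactly one set bit per partitioning, the AND-popcount equals the number of partitionings in which two vectors share a cell, so an item always reaches the maximum score `t` against itself. A production version would replace the `unpackbits` popcount with a hardware popcount over machine words.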
