TL;DR
BEE-RAG is an entropy-engineering framework that keeps attention entropy invariant to context length, so that retrieval-augmented generation (RAG) performance stays stable as the amount of retrieved context varies.
Contribution
The paper proposes the BEE-RAG framework, which uses balanced context entropy to reformulate attention dynamics so that attention sensitivity is decoupled from context length, making RAG systems more stable and adaptable across different context lengths.
Findings
BEE-RAG achieves improved performance across multiple RAG tasks.
The framework effectively stabilizes attention dynamics regardless of context length.
Experimental results demonstrate enhanced robustness and efficiency.
Abstract
With the rapid advancement of large language models (LLMs), retrieval-augmented generation (RAG) has emerged as a critical approach to compensate for the inherent knowledge limitations of LLMs. However, because the volume of retrieved information is typically large, RAG tends to operate with long context lengths. From the perspective of entropy engineering, we identify unconstrained entropy growth and attention dilution due to long retrieval context as significant factors affecting RAG performance. In this paper, we propose the balanced entropy-engineered RAG (BEE-RAG) framework, which improves the adaptability of RAG systems to varying context lengths through the principle of entropy invariance. By leveraging balanced context entropy to reformulate attention dynamics, BEE-RAG separates attention sensitivity from context length, ensuring a stable entropy level. Building upon this, we introduce…
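The two effects named in the abstract, entropy growth and attention dilution, can be illustrated with a toy softmax attention row. The sketch below is not the BEE-RAG method; it only shows the problem the abstract describes. It assumes a single "relevant" token with a fixed score and all other context tokens scored at zero; the function names and the score value of 4.0 are hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def attention_stats(context_len, relevant_score=4.0):
    """Toy attention row: one relevant token, the rest are distractors.

    Assumption for illustration only: the relevant token has a fixed
    score and every other token has score 0. Returns the attention
    weight on the relevant token and the entropy of the whole row.
    """
    scores = [relevant_score] + [0.0] * (context_len - 1)
    weights = softmax(scores)
    return weights[0], entropy(weights)


if __name__ == "__main__":
    for n in (16, 256, 4096):
        weight, h = attention_stats(n)
        print(f"n={n:5d}  weight on relevant token={weight:.4f}  entropy={h:.3f} nats")
```

With the relevant score held fixed, lengthening the context lowers the weight on the relevant token (dilution) and raises the entropy of the attention row toward its upper bound of log n. Per the abstract, entropy invariance aims to keep that entropy level stable regardless of n.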