Higress-RAG: A Holistic Optimization Framework for Enterprise Retrieval-Augmented Generation via Dual Hybrid Retrieval, Adaptive Routing, and CRAG
Weixi Lin

TL;DR
Higress-RAG is an enterprise retrieval-augmented generation framework that improves retrieval precision, reduces hallucinations, and lowers latency through a layered architecture combining adaptive routing, semantic caching, and hybrid retrieval.
Contribution
The paper presents Higress-RAG, a novel architecture integrating adaptive routing, hybrid retrieval, and CRAG to optimize enterprise RAG systems for accuracy and efficiency.
Findings
Improved retrieval precision for complex queries.
Reduced hallucination rates in generated content.
Achieved 50 ms latency with semantic caching.
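To make the semantic caching finding concrete, here is a minimal Python sketch of the general idea: a cache that returns a stored answer when a new query is similar enough to one it has already seen, so the retrieval and generation steps can be skipped. This is not the paper's implementation. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and the `SemanticCache` class, its method names, and the 0.8 similarity threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache keyed by query similarity rather than exact match."""

    def __init__(self, threshold=0.8):  # threshold value is an assumption
        self.threshold = threshold
        self.entries = []  # list of (query embedding, cached answer)

    def put(self, query, answer):
        """Store an answer under the embedding of its query."""
        self.entries.append((embed(query), answer))

    def get(self, query):
        """Return the best-matching cached answer, or None on a cache miss."""
        q = embed(query)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.8)
cache.put("how do I reset my password", "Use the account settings page.")

# A near-duplicate query hits the cache; an unrelated one misses.
hit = cache.get("how do I reset my password please")
miss = cache.get("what is the weather today")
```

A production system would replace the linear scan with a vector index and the toy `embed` with a learned embedding model; the threshold trades cache hit rate against the risk of returning an answer to a query that only looks similar.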
Abstract
The integration of Large Language Models (LLMs) into enterprise knowledge management systems has been catalyzed by the Retrieval-Augmented Generation (RAG) paradigm, which augments parametric memory with non-parametric external data. However, the transition from proof-of-concept to production-grade RAG systems is hindered by three persistent challenges: low retrieval precision for complex queries, high rates of hallucination in the generation phase, and unacceptable latency for real-time applications. This paper presents a comprehensive analysis of the Higress RAG MCP Server, a novel, enterprise-centric architecture designed to resolve these bottlenecks through a "Full-Link Optimization" strategy. Built upon the Model Context Protocol (MCP), the system introduces a layered architecture that orchestrates a sophisticated pipeline of Adaptive Routing, Semantic Caching, Hybrid Retrieval,…
Taxonomy
Topics: Information Retrieval and Search Behavior · Semantic Web and Ontologies · Topic Modeling
