Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards
Omar Erak, Nouf Alabbasi, Omar Alhussein, Ismail Lotfi, Amr Hussein, Sami Muhaidat, Merouane Debbah

TL;DR
This paper introduces a fine-tuned retrieval-augmented generation (RAG) system built on the Phi-2 small language model (SLM). It improves the processing of telecom standards through adaptive chunking, re-ranking, context expansion, and efficient fine-tuning, and it outperforms larger models such as GPT-4.
Contribution
The paper presents a novel RAG system for telecom standards built on the Phi-2 SLM, combining semantic chunking, re-ranking, SelfExtend, and LoRA for improved performance and efficiency.
Findings
Outperforms existing QA methods in the telecom domain
Exceeds GPT-4 performance despite smaller size
Demonstrates effective context expansion and fine-tuning techniques
Abstract
Recent studies show that large language models (LLMs) struggle with technical standards in telecommunications. We propose a fine-tuned retrieval-augmented generation (RAG) system based on the Phi-2 small language model (SLM) to serve as an oracle for communication networks. Our system leverages forward-looking semantic chunking to adaptively determine parsing breakpoints based on embedding similarity, enabling effective processing of diverse document formats. To handle the challenge of multiple similar contexts in technical standards, we employ a re-ranking algorithm to prioritize the most relevant retrieved chunks. Recognizing the limitations of Phi-2's small context window, we implement a recent technique, namely SelfExtend, to expand the context window during inference, which not only boosts performance but can also accommodate a wider range of user queries and design…
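The chunking idea in the abstract (placing breakpoints where embedding similarity between neighbouring text drops) can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words `toy_embed` function stands in for a real sentence-embedding model, the `threshold` value is arbitrary, and comparing each sentence only with the one that follows it is just one simple reading of "forward-looking". All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def toy_embed(sentence):
    """Hypothetical stand-in for a sentence-embedding model: word counts."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(sentences, threshold=0.2, embed=toy_embed):
    """Group sentences into chunks, breaking where similarity to the
    following sentence falls below `threshold` (an assumed value)."""
    if not sentences:
        return []
    vectors = [embed(s) for s in sentences]
    chunks = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            chunks.append([sentences[i]])   # similarity dropped: new chunk
        else:
            chunks[-1].append(sentences[i])  # still on topic: extend chunk
    return chunks


if __name__ == "__main__":
    example = [
        "The UE sends a registration request to the AMF.",
        "The AMF validates the registration request from the UE.",
        "Channel coding uses LDPC codes for data.",
        "LDPC codes protect data on the shared channel.",
    ]
    for chunk in semantic_chunks(example):
        print(chunk)
```

In a real pipeline the toy embedding would be replaced by a trained embedding model, and the threshold would typically be tuned (or derived from the distribution of similarities) rather than fixed.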
Taxonomy
Topics: Advanced Data Compression Techniques · Caching and Content Delivery · Algorithms and Data Compression
Methods: Linear Layer · Residual Connection · Multi-Head Attention · Adam · Layer Normalization · Attention Is All You Need · Position-Wise Feed-Forward Layer · Dense Connections · Byte Pair Encoding · Absolute Position Encodings
