Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards

Omar Erak; Nouf Alabbasi; Omar Alhussein; Ismail Lotfi; Amr Hussein; Sami Muhaidat; Merouane Debbah

arXiv:2408.11775·cs.CL·January 17, 2025·3 cites

TL;DR

This paper introduces a fine-tuned retrieval-augmented generation (RAG) system built on the Phi-2 small language model (SLM). It improves the processing of telecom standards through adaptive chunking, re-ranking, context-window expansion, and efficient fine-tuning, and is reported to outperform larger models such as GPT-4.

Contribution

It presents a novel RAG system for telecom standards built on the Phi-2 SLM, combining semantic chunking, re-ranking, SelfExtend context-window expansion, and LoRA fine-tuning to improve performance and efficiency.
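The re-ranking step named above can be illustrated with a minimal sketch. It assumes a two-stage setup in which a first-stage retriever has already returned candidate chunks; the `overlap_score` function is a hypothetical stand-in for a learned relevance model, and the query and chunk texts are made-up toy strings, not the paper's scorer or data.

```python
import re


def overlap_score(query: str, chunk: str) -> float:
    """Hypothetical relevance scorer: fraction of query terms found in the chunk.

    Assumption: a real system would use a learned re-ranking model here.
    """
    q_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    c_terms = set(re.findall(r"[a-z0-9]+", chunk.lower()))
    return len(q_terms & c_terms) / len(q_terms) if q_terms else 0.0


def rerank(query, candidates, top_n=2, scorer=overlap_score):
    """Re-order first-stage candidates by relevance score and keep the top_n."""
    ranked = sorted(candidates, key=lambda ch: scorer(query, ch), reverse=True)
    return ranked[:top_n]


# Toy candidates in the order a first-stage retriever might return them.
query = "What is the maximum number of HARQ processes"
candidates = [
    "HARQ processes are configured per serving cell.",
    "The scheduler allocates resource blocks.",
    "The maximum number of HARQ processes is configured by higher layers.",
]
top_chunks = rerank(query, candidates, top_n=2)
```

Only the chunks in `top_chunks` would then be placed in the prompt, which matters when the generator has a small context window.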

Findings

01. Outperforms existing QA methods in the telecom domain

02. Exceeds GPT-4 performance despite its smaller size

03. Demonstrates effective context expansion and fine-tuning techniques

Abstract

Recent studies show that large language models (LLMs) struggle with technical standards in telecommunications. We propose a fine-tuned retrieval-augmented generation (RAG) system based on the Phi-2 small language model (SLM) to serve as an oracle for communication networks. The system uses forward-looking semantic chunking to adaptively determine parsing breakpoints based on embedding similarity, enabling effective processing of diverse document formats. To handle the multiple similar contexts that occur in technical standards, we employ a re-ranking algorithm to prioritize the most relevant retrieved chunks. Because Phi-2 has a small context window, we apply a recent technique, SelfExtend, to expand the context window during inference, which not only boosts performance but can also accommodate a wider range of user queries and design…
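The idea of choosing chunk breakpoints from embedding similarity can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes "forward-looking" means each sentence is compared with the sentence that follows it, the `embed` function is a toy bag-of-words stand-in for a real sentence encoder, and the `threshold` value and example sentences are arbitrary.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy bag-of-words embedding (assumption: stands in for a sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk where a sentence's similarity to the next one drops below threshold."""
    if not sentences:
        return []
    chunks, current = [], [sentences[0]]
    for prev, nxt in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(nxt)) < threshold:
            chunks.append(current)  # breakpoint: the topic appears to change here
            current = []
        current.append(nxt)
    chunks.append(current)
    return chunks


# Made-up example sentences covering two topics.
sentences = [
    "The UE sends a registration request to the AMF.",
    "The AMF processes the registration request from the UE.",
    "Channel coding uses LDPC codes for data channels.",
    "LDPC codes are used with polar codes for control channels.",
]
chunks = semantic_chunks(sentences)
```

With these toy sentences the only breakpoint falls between the second and third sentence, so the registration text and the channel-coding text end up in separate chunks. Because breakpoints depend on content rather than a fixed length, the same routine applies to documents with different layouts.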

Code & Models

Repositories

Nouf-Alabbasi/oKUmura_AI_Telecom_challenge (PyTorch, official)

Taxonomy

Topics: Advanced Data Compression Techniques · Caching and Content Delivery · Algorithms and Data Compression

Methods: Linear Layer · Residual Connection · Multi-Head Attention · Adam · Layer Normalization · Attention Is All You Need · Position-Wise Feed-Forward Layer · Dense Connections · Byte Pair Encoding · Absolute Position Encodings