Bisecting K-Means in RAG for Enhancing Question-Answering Tasks Performance in Telecommunications

Pedro Sousa, Cláudio Klautau Mello, Frank B. Morte, and Luis F. Solis Navarro

arXiv:2502.20188·cs.IR·February 28, 2025

Open Access

TL;DR

This paper introduces a Retrieval-Augmented Generation framework using Bisecting K-Means clustering for telecom question-answering, improving relevance and efficiency with small language models on 3GPP datasets.

Contribution

It presents a novel RAG framework tailored for telecom, utilizing Bisecting K-Means to enhance information retrieval and reduce computational costs.

Findings

01 Achieved 66.12% accuracy on phi-2 models

02 Achieved 72.13% accuracy on phi-3 models

03 Reduced training time for small language models

Abstract

Question-answering tasks in the telecom domain are still reasonably unexplored in the literature, primarily due to the field's rapid changes and evolving standards. This work presents a novel Retrieval-Augmented Generation framework explicitly designed for the telecommunication domain, focusing on datasets composed of 3GPP documents. The framework introduces the use of the Bisecting K-Means clustering technique to organize the embedding vectors by contents, facilitating more efficient information retrieval. By leveraging this clustering technique, the system pre-selects a subset of clusters that are most similar to the user's query, enhancing the relevance of the retrieved information. Aiming for models with lower computational cost for inference, the framework was tested using Small Language Models, demonstrating improved performance with an accuracy of 66.12% on phi-2 and 72.13% on…
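The cluster pre-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D vectors stand in for real chunk embeddings, and the cosine similarity measure, the rule of splitting the cluster with the largest squared error, and the names `bisecting_kmeans`, `retrieve`, `n_clusters`, and `top_k` are all assumptions made for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(ids, vecs, rng, iters=20):
    """Split one cluster (a list of indices) in two with plain 2-means."""
    centers = [vecs[i] for i in rng.sample(ids, 2)]
    parts = ([], [])
    for _ in range(iters):
        parts = ([], [])
        for i in ids:
            nearer = 0 if sq_dist(vecs[i], centers[0]) <= sq_dist(vecs[i], centers[1]) else 1
            parts[nearer].append(i)
        if not parts[0] or not parts[1]:
            break  # degenerate split, e.g. duplicate points
        centers = [mean([vecs[i] for i in p]) for p in parts]
    return parts

def bisecting_kmeans(vecs, k, seed=0):
    """Start from one cluster and bisect repeatedly until k clusters exist."""
    rng = random.Random(seed)
    clusters = [list(range(len(vecs)))]

    def sse(ids):
        c = mean([vecs[i] for i in ids])
        return sum(sq_dist(vecs[i], c) for i in ids)

    while len(clusters) < k:
        splittable = [c for c in clusters if len(c) >= 2]
        if not splittable:
            break
        target = max(splittable, key=sse)  # assumed split criterion
        left, right = two_means(target, vecs, rng)
        if not left or not right:
            break
        clusters.remove(target)
        clusters += [left, right]
    return clusters

def retrieve(query, vecs, clusters, n_clusters=1, top_k=2):
    """Pre-select the clusters closest to the query, then rank only their chunks."""
    centroids = [mean([vecs[i] for i in c]) for c in clusters]
    ranked = sorted(range(len(clusters)),
                    key=lambda j: cosine(query, centroids[j]), reverse=True)
    candidates = [i for j in ranked[:n_clusters] for i in clusters[j]]
    return sorted(candidates, key=lambda i: cosine(query, vecs[i]), reverse=True)[:top_k]

# Toy embeddings: chunks 0-2 point along one axis, chunks 3-5 along the other.
chunks = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
          [0.1, 1.0], [0.2, 0.9], [0.0, 1.0]]
clusters = bisecting_kmeans(chunks, k=2)
hits = retrieve([0.1, 0.95], chunks, clusters, n_clusters=1, top_k=2)
print(clusters, hits)
```

Because the query is compared against one centroid per cluster first, only the chunks in the selected clusters are scored individually, which is where the efficiency gain in this kind of retrieval would come from.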

Taxonomy

Topics: Topic Modeling · Information Retrieval and Search Behavior · Expert finding and Q&A systems

Methods: k-Means Clustering