Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications
Thaina Saraiva, Marco Sousa, Pedro Vieira, António Rodrigues

TL;DR
This paper introduces Telco-DPR, a hybrid dataset for evaluating retrieval models on 3GPP telecom documents, and demonstrates that hierarchical retrieval models and RAG techniques improve QA performance in this domain.
Contribution
The paper presents a new hybrid dataset, Telco-DPR, combining text and tables, and evaluates retrieval models and QA systems tailored for telecom technical documents.
Findings
DHR outperforms traditional retrieval models with 86.2% Top-10 accuracy.
RAG with GPT-4 improves answer accuracy by 14%.
Hybrid dataset enables effective evaluation of retrieval and QA models.
Abstract
This paper proposes a Question-Answering (QA) system for the telecom domain using 3rd Generation Partnership Project (3GPP) technical documents. It also presents Telco-DPR, a hybrid dataset consisting of a curated 3GPP corpus that combines text and tables. The dataset additionally includes a set of synthetic question/answer pairs designed to evaluate the retrieval performance of QA systems on this type of data. Retrieval models, including the sparse Best Matching 25 (BM25) model and dense models such as Dense Passage Retriever (DPR) and Dense Hierarchical Retrieval (DHR), are evaluated and compared using top-K accuracy and Mean Reciprocal Rank (MRR). The results show that DHR, a retriever model utilising hierarchical passage selection through fine-tuning at both the document and passage levels, outperforms traditional methods in retrieving…
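The two retrieval metrics named in the abstract, top-K accuracy and MRR, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the passage identifiers, and the assumption of exactly one gold passage per question are all hypothetical and chosen only for illustration.

```python
from typing import List


def top_k_accuracy(ranked: List[List[str]], gold: List[str], k: int) -> float:
    """Fraction of questions whose gold passage appears among the top-k results."""
    hits = sum(1 for results, g in zip(ranked, gold) if g in results[:k])
    return hits / len(gold)


def mean_reciprocal_rank(ranked: List[List[str]], gold: List[str]) -> float:
    """Average of 1/rank of the gold passage; a miss contributes 0."""
    total = 0.0
    for results, g in zip(ranked, gold):
        if g in results:
            total += 1.0 / (results.index(g) + 1)  # ranks are 1-based
    return total / len(gold)


# Toy data (hypothetical passage ids): one ranked list per question,
# and one assumed gold passage id per question.
ranked_lists = [
    ["p3", "p1", "p7"],  # gold "p1" found at rank 2
    ["p2", "p9", "p4"],  # gold "p2" found at rank 1
    ["p5", "p6", "p8"],  # gold "p0" not retrieved
]
gold_ids = ["p1", "p2", "p0"]

top1 = top_k_accuracy(ranked_lists, gold_ids, k=1)
top3 = top_k_accuracy(ranked_lists, gold_ids, k=3)
mrr = mean_reciprocal_rank(ranked_lists, gold_ids)
```

Under these definitions, top-K accuracy only records whether the gold passage was retrieved within the cutoff, while MRR also rewards placing it nearer the top of the ranking.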
Taxonomy
Topics: IPv6, Mobility, Handover, Networks, Security · Web Data Mining and Analysis · Network Packet Processing and Optimization
Methods: Linear Layer · Attention Dropout · Weight Decay · Dense Connections · Label Smoothing · Byte Pair Encoding · BART · Layer Normalization · Residual Connection
