ORAN-Bench-13K: An Open Source Benchmark for Assessing LLMs in Open Radio Access Networks

Pranshav Gajjar; Vijay K. Shah

arXiv:2407.06245·cs.NI·July 16, 2024

TL;DR

This paper introduces ORAN-Bench-13K, a benchmark of 13,952 multiple-choice questions generated from 116 O-RAN specification documents for evaluating LLMs on O-RAN knowledge, and proposes ORANSight, a RAG-based pipeline that improves LLM performance on the benchmark.

Contribution

It presents the first large-scale benchmark for LLMs in O-RAN and introduces ORANSight, a retrieval-augmented approach that enhances LLM performance in this domain.

Findings

01

Current LLMs perform poorly on O-RAN tasks.

02

ORANSight outperforms the other evaluated models, reaching a macro accuracy of 0.784.

03

Incorporating retrieval improves LLM performance by over 21%.
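
The macro accuracy figure above can be illustrated with a minimal sketch. It assumes that "macro accuracy" means the unweighted mean of per-difficulty accuracies over the benchmark's three difficulty levels; that definition, the function name `macro_accuracy`, and the record layout `(difficulty, predicted, correct)` are illustrative assumptions, not taken from the paper or its repository.

```python
from collections import defaultdict


def macro_accuracy(records):
    """Return (macro accuracy, per-difficulty accuracy).

    `records` is an iterable of (difficulty, predicted, correct) tuples.
    This layout is hypothetical and only serves the illustration.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for difficulty, predicted, correct in records:
        totals[difficulty] += 1
        hits[difficulty] += int(predicted == correct)
    if not totals:
        raise ValueError("no records to score")
    # Accuracy within each difficulty level.
    per_level = {d: hits[d] / totals[d] for d in totals}
    # Unweighted mean across levels: each level counts equally,
    # regardless of how many questions it contains.
    macro = sum(per_level.values()) / len(per_level)
    return macro, per_level


# Toy data: answers are option indices; values are made up.
toy = [
    ("easy", 1, 1), ("easy", 2, 2),
    ("medium", 1, 2), ("medium", 3, 3),
    ("hard", 4, 1),
]
macro, per_level = macro_accuracy(toy)
```

Under this definition a small difficulty level weighs as much as a large one, so the macro figure can differ from the plain fraction of questions answered correctly.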

Abstract

Large Language Models (LLMs) can revolutionize how we deploy and operate Open Radio Access Networks (O-RAN) by enhancing network analytics, anomaly detection, and code generation, and by significantly increasing the efficiency and reliability of a plethora of O-RAN tasks. In this paper, we present ORAN-Bench-13K, the first comprehensive benchmark designed to evaluate the performance of LLMs within the context of O-RAN. Our benchmark consists of 13,952 meticulously curated multiple-choice questions generated from 116 O-RAN specification documents. We leverage a novel three-stage LLM framework, and the questions are categorized into three distinct difficulties to cover a wide spectrum of O-RAN-related knowledge. We thoroughly evaluate the performance of several state-of-the-art LLMs, including Gemini, ChatGPT, and Mistral. Additionally, we propose ORANSight, a…

Code & Models

Repositories

prnshv/oran-bench-13k (official)

Taxonomy

Topics: Library Science and Information Systems · Digital Rights Management and Security · Wikis in Education and Collaboration