SKILLS: Structured Knowledge Injection for LLM-Driven Telecommunications Operations
Ivo Brett

TL;DR
This paper presents SKILLS, a benchmark framework for evaluating LLMs in telecom workflows, demonstrating that structured domain guidance significantly improves model performance across diverse scenarios.
Contribution
Introduces SKILLS, a benchmark of 37 telecom operations scenarios across 8 TM Forum Open API domains, backed by live mock API servers, MCP tool interfaces, and deterministic evaluation rubrics, to assess how structured domain knowledge affects LLM agents in telecom operations.
Findings
Structured guidance improves LLM performance by up to 18.9 percentage points.
MiniMax M2.5 achieves an 81.1% success rate with domain skills.
Models show consistent skill lift across multiple scenarios.
Abstract
As telecommunications operators accelerate adoption of AI-enabled automation, a practical question remains unresolved: can general-purpose large language model (LLM) agents reliably execute telecom operations workflows through real API interfaces, or do they require structured domain guidance? We introduce SKILLS (Structured Knowledge Injection for LLM-driven Service Lifecycle operations), a benchmark framework comprising 37 telecom operations scenarios spanning 8 TM Forum Open API domains (TMF620, TMF621, TMF622, TMF628, TMF629, TMF637, TMF639, TMF724). Each scenario is grounded in live mock API servers with seeded production-representative data, MCP tool interfaces, and deterministic evaluation rubrics combining response content checks, tool-call verification, and database state assertions. We evaluate open-weight models under two conditions: baseline (generic agent with tool access…
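The abstract describes deterministic rubrics that combine response content checks, tool-call verification, and database state assertions. A minimal sketch of how such a rubric might be evaluated follows. Everything in it is an assumption made for illustration: the `Rubric` fields, the `evaluate` function, the tool and record names, and the rule that a scenario passes only when all three checks pass. None of it is taken from the SKILLS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Rubric:
    """Hypothetical rubric for one scenario (field names are illustrative)."""
    required_phrases: list[str] = field(default_factory=list)  # response content checks
    required_tools: list[str] = field(default_factory=list)    # tool-call verification
    expected_state: dict[str, str] = field(default_factory=dict)  # database state assertions


def evaluate(rubric: Rubric, response: str, tool_calls: list[str],
             db_state: dict[str, str]) -> dict[str, bool]:
    """Run the three deterministic checks and report each result.

    Assumption: a scenario passes only if every check passes; the
    benchmark may aggregate the checks differently.
    """
    text = response.lower()
    content_ok = all(p.lower() in text for p in rubric.required_phrases)
    called = set(tool_calls)
    tools_ok = all(t in called for t in rubric.required_tools)
    state_ok = all(db_state.get(k) == v for k, v in rubric.expected_state.items())
    return {
        "content": content_ok,
        "tools": tools_ok,
        "state": state_ok,
        "passed": content_ok and tools_ok and state_ok,
    }


# Hypothetical scenario: the agent must open a ticket and report it.
rubric = Rubric(
    required_phrases=["ticket"],
    required_tools=["create_ticket"],        # illustrative tool name
    expected_state={"ticket:1": "open"},     # illustrative record key
)
result = evaluate(
    rubric,
    response="Ticket created and set to open.",
    tool_calls=["create_ticket"],
    db_state={"ticket:1": "open"},
)
print(result)
```

Because each check is a pure function of the response text, the recorded tool calls, and the final database state, the same run always scores the same way, which is the property a deterministic rubric needs.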
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Natural Language Processing Techniques · Topic Modeling
