TSNBench: Benchmarking LLM Proficiency in Time-Sensitive Networking

Rubi Debnath; Daniel Bujosa Mateu; Luxi Zhao; Silviu S. Craciunas; Paul Pop; Sebastian Steinhorst

arXiv:2605.09481 · cs.NI · May 12, 2026

TL;DR

TSNBench is a new benchmark that evaluates large language models' proficiency in Time-Sensitive Networking (TSN), a safety-critical domain. Models score well on multiple-choice questions but show significant gaps on open-ended worst-case delay computation tasks.

Contribution

The paper introduces TSNBench, the first comprehensive benchmark for LLMs in TSN. It comprises 939 expert-validated multiple-choice questions (MCQs) and 100 open-ended Worst-Case Delay (WCD) computation tasks with verified ground truths.

Findings

01

LLMs achieve 67-95% accuracy on MCQs but perform poorly on WCD tasks.

02

GPT-5 achieves a mean absolute percentage error (MAPE) of 36.2% on Credit-Based Shaper (CBS) WCD prediction, the lowest among the tested models.

03

Most models' WCD prediction errors exceed 80%, risking real-time safety violations.
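
The error figures above are reported as MAPE. As a rough illustration of how such a score is obtained, the sketch below assumes the standard definition, MAPE = (100 / n) * sum(|predicted - true| / |true|); the exact formula used in the paper is not shown in this summary. The function name `mape` and the delay values are hypothetical and are not taken from the benchmark.

```python
def mape(ground_truth, predictions):
    """Mean absolute percentage error, in percent (standard definition, assumed)."""
    if len(ground_truth) != len(predictions):
        raise ValueError("inputs must have the same length")
    if not ground_truth:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for truth, pred in zip(ground_truth, predictions):
        if truth == 0:
            # The relative error is undefined when the reference value is zero.
            raise ValueError("ground-truth values must be non-zero")
        total += abs(pred - truth) / abs(truth)
    return 100.0 * total / len(ground_truth)


# Hypothetical worst-case delays in microseconds (illustration only).
true_wcd_us = [100.0, 200.0, 400.0]
predicted_wcd_us = [80.0, 260.0, 400.0]

score = mape(true_wcd_us, predicted_wcd_us)
print(f"MAPE: {score:.2f}%")
```

Note that MAPE treats overestimates and underestimates alike. For a worst-case bound, an underestimate is the riskier direction, because a flow could then miss a deadline that the prediction said it would meet.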

Abstract

We present TSNBench, the first benchmark for evaluating large language model (LLM) proficiency in Time-Sensitive Networking (TSN), a suite of IEEE 802.1 standards for deterministic communication with bounded latency in safety-critical domains such as autonomous vehicles, aviation, defense, and industrial automation. While LLMs have been extensively evaluated on general knowledge tasks, their capabilities in safety-critical networking domains remain largely unexplored. TSNBench comprises 939 expert-validated multiple-choice questions (MCQs) covering diverse TSN mechanisms, along with 100 open-ended Worst-Case Delay (WCD) computation tasks for Credit-Based Shaper (CBS) and Cyclic Queuing and Forwarding (CQF) across varying network topologies and traffic conditions. MCQ answers are validated by domain experts, and open-ended ground truth WCD values are computed using a verified Network…
