LogRouter: Adaptive Two-Level LLM Routing for Log Question Answering in Big Data Systems

Mert Coskuner; Merve Zeybel; Melik Mert Dolan

arXiv:2605.18015·cs.LG·May 19, 2026

TL;DR

LogRouter is an adaptive, cost-aware routing system for log question answering in big data environments. It routes each query to one of several retrieval and generation paths rather than sending every query to a large language model.

Contribution

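The paper introduces a two-level routing architecture: Level 1 picks one of four execution paths (direct response, Druid keyword search, template lookup with SQL generation, or pgvector semantic retrieval), and Level 2 picks a 14B-class or 32B-class generator for the semantic path.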

Findings

01 Achieves 88.4% routing accuracy across datasets.

02 Reduces latency by 55% compared to fixed large models.

03 Maintains high answer correctness and faithfulness.

Abstract

Production log analytics in self-hosted, resource-constrained environments requires natural-language access to massive log streams without the cost of routing every query through a large language model. We present LogRouter, an end-to-end log question-answering system deployed on TUBITAK BILGEM's national big data platform that combines a PySpark-based Drain3 ingestion pipeline, GPU-accelerated embeddings, and dual-index storage in Apache Druid and PostgreSQL with pgvector. A two-level cost-aware router dispatches each query along one of four execution paths: direct response, Druid keyword search, template lookup with SQL generation, and pgvector semantic retrieval, while a Level-2 router selects either a 14B-class or 32B-class generator for the semantic path. A dedicated coder LLM handles text-to-SQL generation. We evaluate the system on four LogHub datasets (Linux, Apache, Windows,…
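
The two-level dispatch described in the abstract can be illustrated with a minimal Python sketch. Only the four path names and the two generator classes come from the abstract. The cue lists, the token threshold, and the function names `level1`, `level2`, and `route` are hypothetical placeholders, and the rule-based decisions stand in for whatever routing method the paper actually uses, which the abstract does not specify.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Path(Enum):
    """The four execution paths named in the abstract."""
    DIRECT = "direct_response"
    KEYWORD = "druid_keyword_search"
    TEMPLATE_SQL = "template_lookup_sql"
    SEMANTIC = "pgvector_semantic_retrieval"


class Generator(Enum):
    """The two generator sizes named in the abstract."""
    SMALL = "14B-class"
    LARGE = "32B-class"


@dataclass(frozen=True)
class Route:
    path: Path
    generator: Optional[Generator] = None  # set only on the semantic path


# Hypothetical cue lists: assumptions for illustration, not from the paper.
GREETINGS = {"hi", "hello", "thanks", "help"}
AGGREGATE_CUES = ("how many", "count", "average", "top ", "per hour", "group by")
REASONING_CUES = ("why", "root cause", "explain", "compare", "correlate")


def level1(query: str) -> Path:
    """Level 1: choose an execution path (toy heuristic)."""
    q = query.lower().strip()
    if q.rstrip("!?.") in GREETINGS:
        return Path.DIRECT          # no retrieval needed
    if any(cue in q for cue in AGGREGATE_CUES):
        return Path.TEMPLATE_SQL    # aggregate question, answer via SQL
    if '"' in query:
        return Path.KEYWORD         # quoted literal, exact keyword search
    return Path.SEMANTIC            # everything else, vector retrieval


def level2(query: str, max_small_tokens: int = 12) -> Generator:
    """Level 2: choose a generator size for the semantic path (toy heuristic)."""
    q = query.lower()
    if any(cue in q for cue in REASONING_CUES) or len(q.split()) > max_small_tokens:
        return Generator.LARGE
    return Generator.SMALL


def route(query: str) -> Route:
    """Run Level 1, then Level 2 only if the semantic path was chosen."""
    path = level1(query)
    if path is Path.SEMANTIC:
        return Route(path, level2(query))
    return Route(path)
```

For instance, `route("How many errors per hour on the Apache server?")` takes the template/SQL path in this sketch and never invokes Level 2, while `route("Why did sshd authentication fail repeatedly?")` falls through to the semantic path and is assigned the larger generator. The point of the structure is that the more expensive generator is considered only after the cheaper paths have been ruled out.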
