HYVE: Hybrid Views for LLM Context Engineering over Machine Data
Jian Tan, Fan Bu, Yuqing Gao, Dev Khanolkar, Jason Mackay, Boris Sobolev, Lei Jin, Li Zhang

TL;DR
HYVE is a framework that helps LLMs process large, structured machine data by preprocessing inputs, which reduces token usage and improves output quality on real-world tasks.
Contribution
HYVE introduces a novel context engineering approach that surrounds LLM invocation with preprocessing and postprocessing, leveraging a request-scoped datastore for better handling of complex machine data.
Findings
Reduces token usage by 50-90% across benchmarks.
Improves chart-generation accuracy by up to 132%.
Reduces latency by up to 83%.
Abstract
Machine data is central to observability and diagnosis in modern computing systems, appearing in logs, metrics, telemetry traces, and configuration snapshots. When provided to large language models (LLMs), this data typically arrives as a mixture of natural language and structured payloads such as JSON or Python/AST literals. Yet LLMs remain brittle on such inputs, particularly when they are long, deeply nested, and dominated by repetitive structure. We present HYVE (HYbrid ViEw), a framework for LLM context engineering for inputs containing large machine-data payloads, inspired by database management principles. HYVE surrounds model invocation with coordinated preprocessing and postprocessing, centered on a request-scoped datastore augmented with schema information. During preprocessing, HYVE detects repetitive structure in raw inputs, materializes it in the datastore, transforms it…
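The preprocessing idea in the abstract (detect repetitive structure, materialize it in a request-scoped datastore, and hand the model a smaller view) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `find_record_arrays`, `materialize`, and `build_compact_view`, the use of an in-memory SQLite database as the datastore, and the "same key set" heuristic for repetition are all assumptions made for illustration.

```python
import json
import sqlite3


def find_record_arrays(node, path="$"):
    """Yield (path, rows) for each list of dicts that all share one key set.

    Hypothetical heuristic: "repetitive structure" here simply means a JSON
    array with more than one object, all having identical keys.
    """
    if isinstance(node, list):
        if len(node) > 1 and all(isinstance(r, dict) for r in node):
            keys = set(node[0])
            if keys and all(set(r) == keys for r in node):
                yield path, node
                return
        for i, item in enumerate(node):
            yield from find_record_arrays(item, f"{path}[{i}]")
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from find_record_arrays(value, f"{path}.{key}")


def materialize(conn, table, rows):
    """Store rows in a SQLite table; nested values are kept as JSON text.

    Assumes keys contain no double quotes, since they are used as column names.
    """
    cols = sorted(rows[0])
    col_sql = ", ".join(f'"{c}"' for c in cols)
    conn.execute(f'CREATE TABLE "{table}" ({col_sql})')
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [
            tuple(
                json.dumps(r[c]) if isinstance(r[c], (dict, list)) else r[c]
                for c in cols
            )
            for r in rows
        ],
    )
    return cols


def build_compact_view(payload, sample_rows=2):
    """Return (datastore, views): full data in SQLite, a small summary for the prompt."""
    conn = sqlite3.connect(":memory:")  # lives only for this request
    views = []
    for n, (path, rows) in enumerate(find_record_arrays(payload)):
        table = f"t{n}"
        cols = materialize(conn, table, rows)
        views.append({
            "path": path,
            "table": table,
            "columns": cols,
            "row_count": len(rows),
            "sample": rows[:sample_rows],
        })
    return conn, views


# Usage: a log-like payload with 100 structurally identical events.
payload = {
    "service": "api",
    "events": [{"ts": i, "level": "INFO", "msg": "ok"} for i in range(100)],
}
conn, views = build_compact_view(payload)
print(json.dumps(views, indent=2))
```

In this sketch the prompt would carry only the schema, row count, and a couple of sample rows, while the full records stay queryable in the datastore for any postprocessing step. How HYVE itself transforms and reinserts the data is not visible in the truncated abstract, so that part is left out.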
