Structured Context Engineering for File-Native Agentic Systems: Evaluating Schema Accuracy, Format Effectiveness, and Multi-File Navigation at Scale

Damon McMillan

arXiv:2602.05447·cs.CL·February 13, 2026

Open Access

TL;DR

This study systematically evaluates how context structure, format, and retrieval architecture affect the accuracy of large language model agents in file-native systems. It finds that model capability is the dominant factor and offers guidance for practical deployment.

Contribution

It offers the first comprehensive empirical analysis of context engineering for structured data in LLM agents, covering multiple models, formats, and schemas at scale.

Findings

01

File-based context retrieval improves accuracy for frontier-tier models (+2.7%, p=0.029) but shows mixed results for open source models (aggregate -7.7%, p<0.001).

02

Format does not significantly affect aggregate accuracy, but it does affect individual models.

03

Model capability is the primary determinant of accuracy, surpassing format or architecture effects.

Abstract

Large Language Model agents increasingly operate external systems through programmatic interfaces, yet practitioners lack empirical guidance on how to structure the context these agents consume. Using SQL generation as a proxy for programmatic agent operations, we present a systematic study of context engineering for structured data, comprising 9,649 experiments across 11 models, 4 formats (YAML, Markdown, JSON, Token-Oriented Object Notation [TOON]), and schemas ranging from 10 to 10,000 tables. Our findings challenge common assumptions. First, architecture choice is model-dependent: file-based context retrieval improves accuracy for frontier-tier models (Claude, GPT, Gemini; +2.7%, p=0.029) but shows mixed results for open source models (aggregate -7.7%, p<0.001), with deficits varying substantially by model. Second, format does not significantly affect aggregate accuracy…
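To make the format comparison concrete, here is a minimal sketch of how one schema might be rendered in two of the four formats the abstract names (JSON and Markdown) before being handed to an agent. The table, column names, and helper functions are hypothetical illustrations, not taken from the paper, and the layout of each rendering is an assumption rather than the study's actual encoding.

```python
import json

# Hypothetical toy schema; table and column names are illustrative only
# and do not come from the paper.
SCHEMA = {
    "orders": [
        {"name": "id", "type": "INTEGER"},
        {"name": "customer_id", "type": "INTEGER"},
        {"name": "total", "type": "REAL"},
    ],
}


def to_json(schema):
    """Render the schema as indented JSON text."""
    return json.dumps(schema, indent=2)


def to_markdown(schema):
    """Render the schema as one Markdown table per database table."""
    lines = []
    for table, columns in schema.items():
        lines.append(f"## {table}")
        lines.append("")
        lines.append("| column | type |")
        lines.append("| --- | --- |")
        for col in columns:
            lines.append(f"| {col['name']} | {col['type']} |")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(to_json(SCHEMA))
    print(to_markdown(SCHEMA))
```

Both renderings carry the same information, so any accuracy difference between them would come from how a given model reads the format rather than from what the context contains.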

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Model-Driven Software Engineering Techniques · Speech and dialogue systems