Are LLMs Ready for TOON? Benchmarking Structural Correctness-Sustainability Trade-offs in Novel Structured Output Formats
Elio Masciari, Vincenzo Moscato, Enea Vincenzo Napolitano, Gian Marco Orlando, Marco Perillo, Diego Russo

TL;DR
This paper introduces a sustainability-aware evaluation framework for the structured output formats that LLMs generate, highlights the trade-off between structural correctness and environmental efficiency, and benchmarks the novel TOON format against JSON, XML, and YAML across a range of models.
Contribution
The paper proposes a new evaluation metric, GCS_env, that combines structural correctness with estimated carbon emissions, and systematically benchmarks TOON against JSON, XML, and YAML to expose the trade-off between efficiency and correctness.
Findings
TOON produces more compact outputs with lower emissions.
Larger model capacity improves the structural correctness of TOON outputs.
Environment-aware scoring can alter format rankings based on deployment needs.
Abstract
Large Language Models (LLMs) are increasingly required to generate structured, machine-readable outputs for downstream systems. While recent benchmarks have focused on evaluating the structural correctness of such outputs, the environmental impact of inference for different output formats has largely been overlooked. In this paper, we argue that structured output formats should be assessed not only in terms of correctness, but also with respect to their environmental efficiency. To this end, we introduce a sustainability-aware evaluation framework for structured generation that measures token usage, generation time, and estimated carbon emissions. Within this framework, we propose the Environment-Aware Generation Correctness Score (GCS_env), a unified metric that integrates structural correctness with carbon-aware efficiency. Using this framework, we systematically benchmark the novel…
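The idea of folding correctness and carbon cost into one score, and the finding that such a score can reorder formats, can be illustrated with a minimal sketch. This is not the paper's GCS_env definition, which the abstract does not spell out. The weighted-sum form, the `alpha` weight, the normalization by the worst emitter, the names `format_a` and `format_b`, and all numbers below are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FormatResult:
    name: str
    correctness: float  # fraction of structurally valid outputs, in [0, 1]
    emissions_g: float  # estimated grams CO2-eq per generation (placeholder value)


def env_aware_score(result: FormatResult, max_emissions_g: float, alpha: float) -> float:
    """Hypothetical score: weighted sum of correctness and carbon efficiency.

    alpha = 0 ignores emissions entirely; alpha = 1 ignores correctness.
    Efficiency is 1 minus emissions relative to the worst emitter.
    """
    efficiency = 1.0 - result.emissions_g / max_emissions_g
    return (1.0 - alpha) * result.correctness + alpha * efficiency


def rank(results: list[FormatResult], alpha: float) -> list[FormatResult]:
    """Sort formats from best to worst under the hypothetical score."""
    worst = max(r.emissions_g for r in results)
    return sorted(results, key=lambda r: env_aware_score(r, worst, alpha), reverse=True)


# Placeholder inputs, not measurements from the paper.
results = [
    FormatResult("format_a", correctness=0.95, emissions_g=1.0),  # accurate but verbose
    FormatResult("format_b", correctness=0.85, emissions_g=0.5),  # compact, less accurate
]

for alpha in (0.0, 0.5):
    print(alpha, [r.name for r in rank(results, alpha)])
```

With these placeholder inputs, the more accurate format ranks first when emissions are ignored, and the more compact format ranks first once emissions carry half the weight. How much weight a deployment gives to emissions is the kind of choice that the finding about altered rankings refers to.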
Taxonomy
Topics: Green IT and Sustainability · Scientific Computing and Data Management · Software System Performance and Reliability
