Toward Architecture-Aware Evaluation Metrics for LLM Agents

Débora Souza; Patrícia Machado

arXiv:2601.19583 · cs.SE · January 28, 2026

Open Access

TL;DR

This paper introduces an architecture-aware evaluation framework for LLM agents, linking architectural components to observable behaviors and metrics, thereby improving the diagnostic and evaluative process.

Contribution

It presents a novel, lightweight approach that incorporates architectural insights into the evaluation of LLM agents, enhancing transparency and specificity.

Findings

01. Framework effectively links architecture to observable behaviors.
02. Application demonstrates improved evaluation clarity.
03. Enables targeted and transparent assessment of LLM agents.

Abstract

LLM-based agents are becoming central to software engineering tasks, yet evaluating them remains fragmented and largely model-centric. Existing studies overlook how architectural components, such as planners, memory, and tool routers, shape agent behavior, which limits their diagnostic power. We propose a lightweight, architecture-informed approach that links agent components to their observable behaviors and to the metrics capable of evaluating them. Our method clarifies what to measure and why, and we illustrate its application on real-world agents, enabling more targeted, transparent, and actionable evaluation of LLM-based agents.
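
The component-to-behavior-to-metric linkage described in the abstract can be pictured as a small lookup structure. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the component names (planner, memory, tool router) come from the abstract, while the behavior descriptions, the metric names, and the helpers `metrics_for` and `unmapped` are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentProfile:
    """One architectural component, what it observably does, and how to score it."""
    component: str
    behaviors: tuple[str, ...]
    metrics: tuple[str, ...]


# Component names are taken from the abstract. The behaviors and metric names
# are illustrative assumptions, not the mapping proposed in the paper.
PROFILES = (
    ComponentProfile("planner", ("decomposes a task into steps",), ("plan_validity_rate",)),
    ComponentProfile("memory", ("recalls earlier context",), ("recall_accuracy",)),
    ComponentProfile("tool_router", ("selects a tool for each step",), ("tool_selection_accuracy",)),
)


def metrics_for(agent_components):
    """Return the metrics that apply to an agent, given the components it contains."""
    present = set(agent_components)
    selected = []
    for profile in PROFILES:
        if profile.component in present:
            selected.extend(profile.metrics)
    return selected


def unmapped(agent_components):
    """Return components that have no profile yet, so gaps in coverage are visible."""
    known = {profile.component for profile in PROFILES}
    return sorted(set(agent_components) - known)
```

With such a table, an agent that has a planner and a tool router but no memory would be scored only on the planner and router metrics, and a component with no profile (say, a hypothetical "critic") would be reported by `unmapped` rather than silently ignored.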

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Advanced Software Engineering Methodologies · Mobile Agent-Based Network Management