Agentic Performance at the Edge: Insights from Benchmarking
Shiqiang Wang, Herbert Woisetschläger

TL;DR
This paper empirically investigates how model size constraints affect agentic AI performance at the edge, emphasizing the importance of joint model and tool workflow design.
Contribution
It introduces a domain-conditioned evaluation methodology and provides practical guidance for model selection under resource constraints.
Findings
Edge-agent quality is not solely determined by parameter count.
Joint design of models and tools is crucial for robust deployment.
Pareto fronts in accuracy-latency space guide operational strategy.
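To make the last finding concrete, the following is a minimal sketch of how a Pareto front in accuracy-latency space can be extracted from a set of benchmark runs. It is not the authors' implementation: the `Run` type, the `pareto_front` helper, the model names, and all numbers are hypothetical placeholders chosen for illustration only.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    """One benchmark result (hypothetical schema, not from the paper)."""
    name: str
    accuracy: float   # task accuracy, higher is better
    latency_s: float  # end-to-end latency in seconds, lower is better


def pareto_front(runs: List[Run]) -> List[Run]:
    """Return the runs that no other run beats on both accuracy and latency."""
    front = []
    best_accuracy = float("-inf")
    # Visit runs from fastest to slowest; among equal latencies, the most
    # accurate comes first. A run stays on the front only if it is strictly
    # more accurate than every run that is at least as fast.
    # Exact duplicates of a front point are dropped.
    for run in sorted(runs, key=lambda r: (r.latency_s, -r.accuracy)):
        if run.accuracy > best_accuracy:
            front.append(run)
            best_accuracy = run.accuracy
    return front


# Made-up measurements, for illustration only.
runs = [
    Run("model-a", accuracy=0.40, latency_s=0.5),
    Run("model-b", accuracy=0.55, latency_s=1.2),
    Run("model-c", accuracy=0.50, latency_s=2.0),  # slower and less accurate than model-b
    Run("model-d", accuracy=0.70, latency_s=3.5),
]

for run in pareto_front(runs):
    print(f"{run.name}: accuracy={run.accuracy:.2f}, latency={run.latency_s:.1f}s")
```

In this made-up data, model-c falls off the front because model-b is both faster and more accurate. Under a given latency budget, the operational choice would then be the most accurate front member that fits the budget.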
Abstract
Agentic artificial intelligence (AI) is a natural fit for Internet of Things (IoT) and edge systems, but edge deployments are often constrained to models around 8 billion parameters or smaller. An important question is: How much agentic-task quality is lost when model size is constrained by memory, power, and latency budgets? To address this question, in this paper, we provide an initial empirical study considering edge-focused model scaling, general-purpose versus coder-oriented model effects, and tool-enabled execution under a fixed protocol. We introduce a domain-conditioned evaluation methodology, an implementation-grounded analysis of model-tool interactions, practical guidance for model selection under constraints, and an analysis of failure modes that reveals distinct semantic versus execution failure patterns across model families. Our core finding is that edge-agent quality is…
