DDL2PropBank Agent: Benchmarking Multi-Agent Frameworks' Developer Experience Through a Novel Relational Schema Mapping Task
Shafiuddin Rehan Ahmed, Wei Wei

TL;DR
This paper introduces DDL2PropBank, a benchmark for evaluating multi-agent frameworks' developer experience through a relational schema mapping task, assessing code complexity and AI-assistability across frameworks.
Contribution
It presents a novel benchmark task and a comprehensive evaluation methodology for comparing multi-agent frameworks' developer experience.
Findings
Agno outperforms other frameworks with 83% pass@1.
Structural alignment scores correlate with runtime success in single-pattern frameworks.
Pydantic AI and Agno require the least implementation effort.
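For readers unfamiliar with the pass@1 metric cited in the findings, the sketch below shows one common way to compute pass@k from repeated generation attempts. It uses the widely known unbiased estimator 1 - C(n-c, k) / C(n, k), which reduces to c/n when k = 1. This summary does not say which estimator the paper uses, so treat that choice as an assumption. The function name `pass_at_k` and the sample counts are hypothetical and are not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k given n generated samples, of which c passed.

    Assumption: the standard unbiased estimator 1 - C(n-c, k) / C(n, k).
    For k = 1 this is simply the fraction of passing samples, c / n.
    """
    if not (0 <= c <= n):
        raise ValueError("c must satisfy 0 <= c <= n")
    if not (1 <= k <= n):
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples failed, every draw of k contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical counts for illustration only: 10 attempts, 7 passing.
    print(f"pass@1 = {pass_at_k(10, 7, 1):.2f}")
```

With k = 1 the estimator is just the per-attempt success rate, which is why pass@1 is usually read as "how often the first generated solution works".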
Abstract
Multi-agent frameworks promise to simplify LLM-driven software development, yet there is no principled way to evaluate their developer experience in a controlled setting. We introduce DDL2PropBank, a novel benchmark task that maps relational database schemas to PropBank rolesets, requiring autonomous retrieval of candidate frames and fine-grained linguistic reasoning over table names, columns, and relations. Using the Agent-as-a-Tool pattern, we implement identical agent logic across 10 frameworks and evaluate along two dimensions: (i) code complexity via static analysis, and (ii) AI-assistability -- the extent to which LLMs can autonomously generate correct, framework-specific code. Our results reveal a threefold complexity spectrum, with Pydantic AI and Agno requiring the least implementation overhead. For AI-assistability, structural alignment scores reliably proxy runtime success…
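The abstract's first evaluation dimension, code complexity via static analysis, can be illustrated with a small sketch built on Python's standard `ast` module. The metrics chosen here (non-blank source lines, import count, function count, and a rough count of decision points) are assumptions made for illustration; the abstract does not list the paper's actual metrics or tooling. The names `complexity_metrics` and `SAMPLE` are hypothetical.

```python
import ast

# Node types counted as decision points in this sketch (an assumption,
# loosely following how cyclomatic complexity is usually approximated).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.BoolOp,
                   ast.ExceptHandler, ast.IfExp)


def complexity_metrics(source: str) -> dict:
    """Return simple static metrics for a Python source string."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    # Count lines that are neither blank nor comment-only.
    sloc = sum(
        1 for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )
    return {
        "sloc": sloc,
        "imports": sum(isinstance(n, (ast.Import, ast.ImportFrom)) for n in nodes),
        "functions": sum(
            isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) for n in nodes
        ),
        "decision_points": sum(isinstance(n, _DECISION_NODES) for n in nodes),
    }


# Hypothetical snippet standing in for one framework's agent implementation.
SAMPLE = '''
import os

def pick(frames, score):
    best = None
    for f in frames:
        if best is None or score(f) > score(best):
            best = f
    return best
'''

if __name__ == "__main__":
    print(complexity_metrics(SAMPLE))
```

Running the same function over each framework's implementation of identical agent logic would give comparable numbers per framework, which is the general idea behind a static complexity comparison.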
Taxonomy
Topics: Software Engineering Research · Multi-Agent Systems and Negotiation · Model-Driven Software Engineering Techniques
