CADMAS-CTX: Contextual Capability Calibration for Multi-Agent Delegation
Chuhan Qiao

TL;DR
This paper introduces CADMAS-CTX, a framework that dynamically calibrates agent capabilities based on context, improving multi-agent delegation accuracy and robustness over static skill profiles.
Contribution
It proposes a hierarchical, context-aware capability calibration method that maintains a Beta posterior for each agent, skill, and context bucket, with formal regret bounds and empirical validation on GAIA and SWE-bench.
Findings
CADMAS-CTX outperforms static baselines on GAIA and SWE-bench.
The uncertainty penalty improves robustness to noise in context tagging.
Contextual calibration reduces delegation errors and improves teamwork.
Abstract
We revisit multi-agent delegation under a stronger and more realistic assumption: an agent's capability is not fixed at the skill level, but depends on task context. A coding agent may excel at short standalone edits yet fail on long-horizon debugging; a planner may perform well on shallow tasks yet degrade on chained dependencies. Static skill-level capability profiles therefore average over heterogeneous situations and can induce systematic misdelegation. We propose CADMAS-CTX, a framework for contextual capability calibration. For each agent, skill, and coarse context bucket, CADMAS-CTX maintains a Beta posterior that captures stable experience in that part of the task space. Delegation is then made by a risk-aware score that combines the posterior mean with an uncertainty penalty, so that agents delegate only when a peer appears better and that assessment is sufficiently well…
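The mechanism described in the abstract (one Beta posterior per agent, skill, and context bucket, scored by posterior mean minus an uncertainty penalty) can be illustrated with a minimal sketch. The abstract is truncated and does not give the exact score, so everything specific here is an assumption: the uniform Beta(1, 1) prior, the use of the posterior standard deviation as the uncertainty term, the `penalty` weight, and the hypothetical names `BetaPosterior`, `CapabilityTable`, `record`, `score`, and `choose`. It is not the paper's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import sqrt


@dataclass
class BetaPosterior:
    """Beta(alpha, beta) belief over an agent's success rate in one bucket."""
    alpha: float = 1.0  # assumption: uniform Beta(1, 1) prior
    beta: float = 1.0

    def update(self, success: bool) -> None:
        # Conjugate update: a success increments alpha, a failure increments beta.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    @property
    def std(self) -> float:
        n = self.alpha + self.beta
        return sqrt(self.alpha * self.beta / (n * n * (n + 1.0)))


class CapabilityTable:
    """Posteriors keyed by (agent, skill, context bucket)."""

    def __init__(self, penalty: float = 1.0) -> None:
        self.penalty = penalty  # assumed weight on the uncertainty term
        self.posteriors = defaultdict(BetaPosterior)

    def record(self, agent: str, skill: str, bucket: str, success: bool) -> None:
        self.posteriors[(agent, skill, bucket)].update(success)

    def score(self, agent: str, skill: str, bucket: str) -> float:
        # Assumed risk-aware score: posterior mean minus penalty * posterior std.
        p = self.posteriors[(agent, skill, bucket)]
        return p.mean - self.penalty * p.std

    def choose(self, me: str, peers: list[str], skill: str, bucket: str) -> str:
        # Delegate only if some peer's risk-aware score strictly beats our own.
        best, best_score = me, self.score(me, skill, bucket)
        for peer in peers:
            s = self.score(peer, skill, bucket)
            if s > best_score:
                best, best_score = peer, s
        return best
```

Under these assumptions, a peer with only a couple of observed successes can have a higher posterior mean than a well-tested peer yet a lower score, because its posterior is still wide. Keying the table by context bucket is what lets the same agent be preferred for short edits and passed over for long-horizon debugging.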
