TL;DR
TeamLLM is a human-like multi-LLM collaboration framework with four distinct team roles and a three-phase collaboration process. It improves performance on multi-step contextualized tasks, as evaluated on CGPST, a new benchmark introduced alongside it.
Contribution
The paper proposes a team-oriented multi-LLM collaboration framework that explicitly emulates human team role division, and introduces the CGPST benchmark for multi-step contextualized tasks.
Findings
TeamLLM significantly enhances LLM performance on the CGPST benchmark.
The CGPST benchmark has four core features: contextual grounding, procedural structure, process-oriented evaluation, and multi-dimensional assessment.
Evaluation of ten LLMs shows improved results with the TeamLLM framework.
Abstract
Recently, multi-Large Language Model (LLM) frameworks have been proposed to solve contextualized tasks. However, these frameworks do not explicitly emulate human team role division, which may lead to a single perspective, thereby weakening performance on multi-step contextualized tasks. To address this issue, we propose TeamLLM, a human-like Team-Oriented Multi-LLM Collaboration Framework. TeamLLM adopts four team roles with distinct division and employs a three-phase multi-LLM collaboration for multi-step contextualized tasks. To evaluate the effectiveness of TeamLLM on multi-step contextualized tasks, we propose Contextually-Grounded and Procedurally-Structured tasks (CGPST) and construct the CGPST benchmark. This benchmark has four core features: contextual grounding, procedural structure, process-oriented evaluation and multi-dimensional assessment. We evaluate ten popular LLMs on…
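The abstract's description of four roles working through three phases can be pictured as a simple orchestration loop. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the roles or the phases, so `role_1` to `role_4` and `phase_1` to `phase_3` are placeholders, and `Role`, `run_team`, and `echo_llm` are hypothetical names. It also assumes that every role speaks once per phase and sees a shared transcript, which the abstract does not state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Any function mapping a prompt to a completion; a real system would call an LLM here.
LLM = Callable[[str], str]


@dataclass
class Role:
    name: str         # placeholder label, not a role name from the paper
    instruction: str  # placeholder role prompt
    llm: LLM          # the model backing this role


def run_team(task: str, roles: List[Role], phases: List[str]) -> Dict[str, List[str]]:
    """Run every role once per phase, sharing a growing transcript (an assumption)."""
    transcript: List[str] = []
    outputs: Dict[str, List[str]] = {}
    for phase in phases:
        outputs[phase] = []
        for role in roles:
            # Each prompt carries the task, the phase, the role prompt, and all prior replies.
            prompt = "\n".join(
                [f"Task: {task}", f"Phase: {phase}", f"Role: {role.instruction}", *transcript]
            )
            reply = role.llm(prompt)
            transcript.append(f"[{phase}/{role.name}] {reply}")
            outputs[phase].append(reply)
    return outputs


def echo_llm(prompt: str) -> str:
    """Stub standing in for a model call: reports how many lines of context it saw."""
    return f"saw {len(prompt.splitlines())} lines"


# Four placeholder roles and three placeholder phases, matching only the counts in the abstract.
roles = [Role(f"role_{i}", f"placeholder instruction {i}", echo_llm) for i in range(1, 5)]
phases = ["phase_1", "phase_2", "phase_3"]
result = run_team("example task", roles, phases)
```

Swapping `echo_llm` for a function that calls an actual model would turn this into a working multi-role pipeline. The role prompts and the way information flows between phases would still need to follow the paper's design.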
