MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration
Lin Xu, Zhiyuan Hu, Daquan Zhou, Hongyu Ren, Zhen Dong, Kurt Keutzer, See Kiong Ng, Jiashi Feng

TL;DR
This paper introduces a new benchmark framework for evaluating large language models in multi-agent social and cognitive scenarios, highlighting their reasoning, collaboration, and rationality capabilities.
Contribution
The paper presents a competition-based benchmark, augmented with probabilistic graphic modeling (PGM), for assessing LLMs' social and cognitive abilities in multi-agent settings, and uses it to evaluate seven models.
Findings
Significant capability gap between strongest and weakest models.
PGM enhancement improves model abilities by an average of 37%.
Evaluation spans two social deduction games and three game-theory scenarios.
Abstract
Large Language Models (LLMs) have significantly advanced natural language processing, demonstrating exceptional reasoning, tool usage, and memory capabilities. As their applications expand into multi-agent environments, there arises a need for a comprehensive evaluation framework that captures LLMs' reasoning, planning, collaboration, and other social abilities. This work introduces a novel competition-based benchmark framework specifically designed to assess LLMs within multi-agent settings, providing quantitative metrics to evaluate their judgment, reasoning, deception, self-awareness, cooperation, coordination, and rationality. We utilize two social deduction games alongside three game-theory scenarios to create diverse environments. Our frame is fortified with the probabilistic graphic modeling (PGM) method, enhancing the LLMs' capabilities in navigating complex social and cognitive…
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Softmax · Position-Wise Feed-Forward Layer · Label Smoothing · Dense Connections · Absolute Position Encodings · Residual Connection
