SMAC-R1: The Emergence of Intelligence in Decision-Making Tasks

Yue Deng; Weiyu Ma; Yuxin Fan; Ruyi Song; Yin Zhang; Haifeng Zhang; Jian Zhao

arXiv:2410.16024·cs.AI·March 7, 2025

TL;DR

This paper introduces SMAC-R1, which uses large language models to generate interpretable decision-tree code for multi-agent reinforcement learning tasks in StarCraft environments. The resulting policies transfer well to new environments and require little environmental exploration.

Contribution

The paper combines LLMs with decision trees for MARL. Its pipeline generates decision-tree code, refines it through self-reflection, and applies fine-tuning, with the aim of making policies more interpretable and more transferable.
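To make "interpretable decision tree" concrete, here is a minimal sketch of what such a policy could look like when written as code. It is not taken from the paper or its repository: the `Unit` type, the `decide` function, the 30% retreat threshold, and the attack range are all hypothetical choices made for illustration, and the sketch does not use the real SMAC API.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass
class Unit:
    """Hypothetical unit state; not the SMAC observation format."""
    x: float
    y: float
    health: float
    max_health: float


def decide(ally: Unit, enemies: List[Unit], attack_range: float = 6.0) -> str:
    """Return an action label for one allied unit.

    Every branch is an explicit, human-readable rule, which is what makes
    a decision-tree policy easy to inspect. Thresholds are assumptions.
    """
    # Keep the original index so the action can name a specific enemy.
    alive = [(i, e) for i, e in enumerate(enemies) if e.health > 0]
    if not alive:
        return "hold"

    # Rule 1: a badly damaged unit pulls back.
    if ally.health / ally.max_health < 0.3:
        return "retreat"

    # Rule 2: focus fire on the weakest enemy within range.
    in_range = [
        (i, e) for i, e in alive
        if hypot(e.x - ally.x, e.y - ally.y) <= attack_range
    ]
    if in_range:
        target_index, _ = min(in_range, key=lambda pair: pair[1].health)
        return f"attack:{target_index}"

    # Rule 3: otherwise close the distance.
    return "advance"
```

Because each rule is a plain conditional, a reader can trace exactly why a unit retreated or chose a target, which is harder to do with a parametric policy network.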

Findings

01. High-quality, interpretable decision trees generated

02. Strong transferability to new environments demonstrated

03. Minimal environmental exploration required

Abstract

The StarCraft Multi-Agent Challenge (SMAC) is one of the most commonly used experimental environments in multi-agent reinforcement learning (MARL); the task is to control a set number of allied units to defeat enemy forces. Traditional MARL algorithms often require millions of environment steps to train a parametric model, and the resulting policies are typically non-interpretable and transfer poorly. In this paper, we introduce SMAC-R1, which is based on the Qwen2.5-7B-Base LLM distilled from DeepSeek-Coder-v2.5-236B. Our pipeline resembles online reinforcement learning that follows offline behavior cloning: given task descriptions, agents use the DeepSeek LLM to generate decision tree code, and then refine that code through self-reflection on the reward feedback provided by the environment. Based on…
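The generate-then-reflect step described in the abstract can be sketched as a simple loop. This is an illustration under stated assumptions, not the authors' implementation: `refine_policy`, the feedback message, the round limit, and the stopping threshold are all hypothetical, and the LLM and the environment are replaced by toy stand-in functions so the example runs on its own. The distillation and fine-tuning stages are not shown.

```python
from typing import Callable, Tuple


def refine_policy(
    generate: Callable[[str, str], str],   # (task description, feedback) -> script
    evaluate: Callable[[str], float],      # script -> reward from the environment
    task: str,
    rounds: int = 3,
    target: float = 1.0,
) -> Tuple[str, float]:
    """Generate a policy script, score it, and feed the score back.

    Returns the best script seen and its reward. The loop structure and
    stopping rule are assumptions made for this sketch.
    """
    best_code, best_reward = "", float("-inf")
    feedback = ""
    for _ in range(rounds):
        code = generate(task, feedback)
        reward = evaluate(code)
        if reward > best_reward:
            best_code, best_reward = code, reward
        if best_reward >= target:
            break
        # Self-reflection input: tell the generator how the last attempt did.
        feedback = f"Previous script scored {reward:.2f}; revise it."
    return best_code, best_reward


# Toy stand-ins (hypothetical): a real setup would call an LLM and run SMAC.
def toy_generate(task: str, feedback: str) -> str:
    return "script_v1" if feedback else "script_v0"


def toy_evaluate(code: str) -> float:
    return {"script_v0": 0.2, "script_v1": 1.0}[code]


result = refine_policy(toy_generate, toy_evaluate, "defeat the enemy units")
```

Here the reward is the only signal passed back to the generator, which mirrors the abstract's description of reflection driven by environment rewards; a fuller setup might also pass back error traces or battle statistics.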

Code & Models

Repositories

devindeng94/llm-smac (Official)

Taxonomy

Topics: Fuzzy Logic and Control Systems · Machine Learning and Data Classification · Data Mining Algorithms and Applications

Methods: Sparse Evolutionary Training