DSADF: Thinking Fast and Slow for Decision Making

Zhihao Dou; Dongfei Cui; Jun Yan; Weida Wang; Benteng Chen; Haoming Wang; Zeke Xie; Shufei Zhang

arXiv:2505.08189·cs.LG·August 26, 2025

TL;DR

This paper introduces DSADF, a dual-system framework combining fast RL-based decisions with slow, deep reasoning via vision language models, improving adaptability and decision quality in complex environments.

Contribution

The paper proposes a novel dual-system decision framework inspired by Kahneman's theory, integrating RL and VLMs for enhanced adaptive decision-making.
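
The summary above does not say how the fast and slow systems are coordinated, so the following is only an illustrative sketch of one possible dual-system arrangement, not the paper's method. It assumes a hypothetical confidence-threshold rule: the fast policy (standing in for an RL agent) returns an action with a confidence score, and control is handed to a slow reasoner (standing in for a VLM) when that score is low. The names `DualSystemAgent`, `toy_fast_policy`, `toy_slow_reasoner`, and `confidence_threshold` are all invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class DualSystemAgent:
    """Hypothetical router between a fast policy and a slow reasoner."""

    # Assumed interface: the fast policy returns (action, confidence in [0, 1]).
    fast_policy: Callable[[Sequence[float]], Tuple[str, float]]
    # Assumed interface: the slow reasoner returns an action only.
    slow_reasoner: Callable[[Sequence[float]], str]
    # Assumed routing rule: below this confidence, defer to the slow system.
    confidence_threshold: float = 0.8

    def act(self, observation: Sequence[float]) -> Tuple[str, str]:
        """Return (action, name of the system that produced it)."""
        action, confidence = self.fast_policy(observation)
        if confidence >= self.confidence_threshold:
            return action, "fast"
        return self.slow_reasoner(observation), "slow"


def toy_fast_policy(observation: Sequence[float]) -> Tuple[str, float]:
    # Stand-in for a trained RL policy: confidence shrinks as the
    # observation moves away from the origin ("unfamiliar" states).
    confidence = 1.0 / (1.0 + sum(abs(x) for x in observation))
    return "move_forward", confidence


def toy_slow_reasoner(observation: Sequence[float]) -> str:
    # Stand-in for a VLM call; a real system would query a model here.
    return "replan"


agent = DualSystemAgent(toy_fast_policy, toy_slow_reasoner, confidence_threshold=0.5)
familiar = agent.act([0.1, 0.2])    # high confidence, handled by the fast path
unfamiliar = agent.act([3.0, 4.0])  # low confidence, deferred to the slow path
```

A threshold rule is the simplest routing choice for illustration; the actual framework may coordinate the two systems quite differently.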

Findings

01 Significant improvement in decision accuracy in unseen tasks

02 Effective balance of fast and slow reasoning processes

03 Demonstrated success in video game environments

Abstract

Although Reinforcement Learning (RL) agents are effective in well-defined environments, they often struggle to generalize their learned policies to dynamic settings because they rely on trial-and-error interactions. Recent work has explored applying Large Language Models (LLMs) or Vision Language Models (VLMs) to boost the generalization of RL agents through policy optimization guidance or prior knowledge. However, these approaches often lack seamless coordination between the RL agent and the foundation model, leading to unreasonable decision-making in unfamiliar environments and to efficiency bottlenecks. How to make full use of the inferential capabilities of foundation models and the rapid response capabilities of RL agents, and how to enhance the interaction between the two so that they form a dual system, remains an open scientific question. To address this problem, we draw inspiration from…
