Multimodal Pretrained Models for Verifiable Sequential Decision-Making: Planning, Grounding, and Perception

Yunhao Yang; Cyrus Neary; Ufuk Topcu

arXiv:2308.05295 · cs.AI · June 19, 2024

Open Access

TL;DR

This paper introduces an algorithm that leverages multimodal pretrained models to construct, verify, and ground automaton-based controllers for sequential decision-making, providing formal guarantees and handling perceptual uncertainties in real-world tasks.

Contribution

The paper develops a method for integrating pretrained models into decision-making controllers, with formal verification of those controllers and grounding in visual environments.

Findings

01. Successfully constructs automaton-based controllers from pretrained models.

02. Provides probabilistic guarantees on controller correctness under perceptual uncertainties.

03. Demonstrates effectiveness on real-world daily life and robot manipulation tasks.

Abstract

Recently developed pretrained models can encode rich world knowledge expressed in multiple modalities, such as text and images. However, the outputs of these models cannot be integrated into algorithms to solve sequential decision-making tasks. We develop an algorithm that utilizes the knowledge from pretrained models to construct and verify controllers for sequential decision-making tasks, and to ground these controllers to task environments through visual observations with formal guarantees. In particular, the algorithm queries a pretrained model with a user-provided, text-based task description and uses the model's output to construct an automaton-based controller that encodes the model's task-relevant knowledge. It allows formal verification of whether the knowledge encoded in the controller is consistent with other independently available knowledge, which may include abstract…
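To make the idea of an automaton-based controller that can be checked against independent knowledge more concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the propositions (`green_light`, `car_approaching`), the states, the actions, and the safety rule are all hypothetical, and the exhaustive search below only stands in for the formal verification step the abstract mentions. In the paper's setting the transition table would be derived from a pretrained model's output; here it is written by hand.

```python
from itertools import chain, combinations

# Hypothetical atomic propositions that a perception module might report.
PROPS = ("green_light", "car_approaching")

# Hypothetical controller: state -> ordered list of (guard, action, next_state).
# A guard maps each proposition it constrains to the truth value it requires;
# an empty guard always matches and acts as the default transition.
TRANSITIONS = {
    "wait": [
        ({"green_light": True, "car_approaching": False}, "cross", "done"),
        ({}, "stay", "wait"),
    ],
    "done": [({}, "stay", "done")],
}


def all_observations(props):
    """Enumerate every set of propositions that could be observed as true."""
    subsets = chain.from_iterable(
        combinations(props, r) for r in range(len(props) + 1)
    )
    return [frozenset(s) for s in subsets]


def step(state, obs):
    """Return (action, next_state) for the first guard satisfied by obs."""
    for guard, action, nxt in TRANSITIONS[state]:
        if all((p in obs) == required for p, required in guard.items()):
            return action, nxt
    raise ValueError(f"no transition from {state!r} under {set(obs)!r}")


def violates(action, obs):
    """Assumed safety rule: only cross on green with no car approaching."""
    return action == "cross" and (
        "green_light" not in obs or "car_approaching" in obs
    )


def check_safety(initial="wait"):
    """Explore every reachable state under every observation.

    Returns a counterexample (state, observation, action), or None if the
    controller never violates the safety rule.
    """
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        for obs in all_observations(PROPS):
            action, nxt = step(state, obs)
            if violates(action, obs):
                return state, obs, action
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None


if __name__ == "__main__":
    print(check_safety())
```

Because the state and observation spaces here are tiny, brute-force enumeration suffices; a realistic controller would more likely be handed to a model checker together with a temporal-logic specification. The sketch also treats observations as exact, so it says nothing about the perceptual uncertainty that the paper's probabilistic guarantees address.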

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Semantic Web and Ontologies · AI-based Problem Solving and Planning