Fine-Tuning Language Models Using Formal Methods Feedback

Yunhao Yang; Neel P. Bhatt; Tyler Ingebrand; William Ward; Steven Carr; Zhangyang Wang; Ufuk Topcu

arXiv:2310.18239 · cs.AI · April 2, 2024 · 1 citation

Open Access

TL;DR

This paper introduces an automated fine-tuning method for pre-trained language models in autonomous systems, using formal methods feedback to improve domain-specific control policies without human input.

Contribution

It presents a novel automated approach that synthesizes and verifies controllers from language models guided by formal specifications, reducing reliance on human feedback.

Findings

01 The share of specifications satisfied by the synthesized controllers increased from 60% to 90% after fine-tuning.

02 The method was demonstrated on autonomous driving tasks.

03 Automated fine-tuning costs less than methods that rely on human feedback.

Abstract

Although pre-trained language models encode generic knowledge beneficial for planning and control, they may fail to generate appropriate control policies for domain-specific tasks. Existing fine-tuning methods use human feedback to address this limitation; however, sourcing human feedback is labor-intensive and costly. We present a fully automated approach to fine-tune pre-trained language models for applications in autonomous systems, bridging the gap between generic knowledge and domain-specific requirements while reducing cost. The method synthesizes automaton-based controllers from pre-trained models guided by natural language task descriptions. These controllers are verifiable against independently provided specifications within a world model, which can be abstract or obtained from a high-fidelity simulator. Controllers with high compliance with the desired specifications receive…
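The verification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy intersection world model, the two controllers, the specification names, and the `compliance` function are all hypothetical, and specifications are simplified here to safety invariants checked by exhaustive reachability over the product of controller and world model. The paper's actual specification language and verifier are not shown in this excerpt.

```python
from collections import deque

# Hypothetical toy world model: a car approaching a traffic light.
# A world state is (light, position); every name here is illustrative.
def world_step(state, action):
    """Return all possible next world states (the light changes freely)."""
    _, pos = state
    if action == "go":
        pos = {"before": "in", "in": "after", "after": "after"}[pos]
    return {(light, pos) for light in ("red", "green")}

# Automaton-based controllers: (controller state, observed light) ->
# (action, next controller state). Both are made up for this sketch.
cautious = {
    ("wait", "red"): ("stop", "wait"),
    ("wait", "green"): ("go", "cross"),
    ("cross", "red"): ("go", "cross"),
    ("cross", "green"): ("go", "cross"),
}
careless = {
    ("q0", "red"): ("go", "q0"),
    ("q0", "green"): ("go", "q0"),
}

# Assumed specifications, each an invariant over (world state, action).
SPECS = {
    "no_red_entry": lambda w, a: not (w == ("red", "before") and a == "go"),
    "no_stop_inside": lambda w, a: not (w[1] == "in" and a == "stop"),
}

INIT_WORLDS = [("red", "before"), ("green", "before")]

def compliance(controller, init_ctrl, init_worlds, specs):
    """Fraction of specs that hold on every reachable (state, action) pair."""
    violated = set()
    seen = {(init_ctrl, w) for w in init_worlds}
    queue = deque(seen)
    while queue:
        q, w = queue.popleft()
        action, q_next = controller[(q, w[0])]  # controller observes the light
        for name, holds in specs.items():
            if not holds(w, action):
                violated.add(name)
        for w_next in world_step(w, action):
            node = (q_next, w_next)
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return 1 - len(violated) / len(specs)

print(compliance(cautious, "wait", INIT_WORLDS, SPECS))  # 1.0
print(compliance(careless, "q0", INIT_WORLDS, SPECS))    # 0.5
```

In this sketch the cautious controller satisfies both assumed specifications, while the careless one enters on red and so satisfies only one of the two. A compliance score of this kind is one plausible way to compare candidate controllers automatically.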

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms