On the Dual-Use Dilemma in Physical Reasoning and Force
William Xie, Enora Rice, Nikolaus Correll

TL;DR
This paper investigates the challenges of safeguarding vision-language models that control robots, showing that safeguards suppress both harmful and beneficial forceful interactions, so value alignment can come at the cost of desirable robot capabilities.
Contribution
It presents two empirical case studies on safeguarding VLMs that generate forceful robotic motion, highlighting the trade-off between safety and functionality.
Findings
Safeguards reduce harmful robotic behaviors.
Safeguards also limit helpful contact-rich manipulation.
Value alignment may impede desirable robot capabilities.
Abstract
Humans learn how and when to apply forces in the world via a complex physiological and psychological learning process. Attempting to replicate this in vision-language models (VLMs) presents two challenges: VLMs can produce harmful behavior, which is particularly dangerous for VLM-controlled robots that interact with the world, but imposing behavioral safeguards can limit their functional and ethical extents. We conduct two case studies on safeguarding VLMs that generate forceful robotic motion, finding that safeguards reduce both harmful and helpful behavior involving contact-rich manipulation of human body parts. We then discuss the key implication of this result, that value alignment may impede desirable robot capabilities, for model evaluation and robot learning.
Taxonomy
Topics: Philosophy and History of Science · Computability, Logic, AI Algorithms
