Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments
Yifan Xu, Vineet Kamat, Carol Menassa

TL;DR
This paper introduces SwPC, a conformal prediction-based framework for VLM-driven place recognition in indoor environments, which quantifies uncertainty to improve accuracy and reduce human intervention in assistive robotics.
Contribution
SwPC is a novel, lightweight framework that measures and aligns uncertainty in VLM-based place recognition without requiring model fine-tuning, enhancing reliability in assistive robotics.
Findings
SwPC significantly improves success rates in place recognition.
SwPC reduces the need for human assistance in complex indoor environments.
SwPC can be integrated with any VLM without additional training.
Abstract
In assistive robotics serving people with disabilities (PWD), accurate place recognition in built environments is crucial to ensure that robots navigate and interact safely within diverse indoor spaces. Language interfaces, particularly those powered by Large Language Models (LLMs) and Vision Language Models (VLMs), hold significant promise in this context, as they can interpret visual scenes and correlate them with semantic information. However, such interfaces are also known for their hallucinated predictions. In addition, language instructions provided by humans can be ambiguous and lack precise details about specific locations, objects, or actions, exacerbating the hallucination issue. In this work, we introduce Seeing with Partial Certainty (SwPC), a framework designed to measure and align uncertainty in VLM-based place recognition, enabling the model to recognize when it lacks…
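The uncertainty-alignment idea in the abstract can be illustrated with generic split conformal prediction. This is a minimal sketch, not the authors' implementation: the function names, the nonconformity score (1 minus the probability the VLM assigns to a place label), and the rule of asking a human whenever the prediction set is not a single label are all assumptions made for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of calibration nonconformity scores.

    cal_scores: one score per calibration image, assumed here to be
    1 - p(true place label) under the VLM (an illustrative choice).
    alpha: target miscoverage rate, e.g. 0.1 for roughly 90% coverage.
    """
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration samples for this alpha: keep every label.
        return float("inf")
    return sorted(cal_scores)[rank - 1]


def prediction_set(label_probs, threshold):
    """Keep every label whose score 1 - p does not exceed the threshold."""
    return [label for label, p in label_probs.items() if 1.0 - p <= threshold]


def decide(label_probs, threshold):
    """Act on a single-label set; otherwise defer to a human (assumed policy)."""
    labels = prediction_set(label_probs, threshold)
    if len(labels) == 1:
        return ("act", labels)
    return ("ask_human", labels)
```

Under this sketch, the threshold is computed once from a held-out calibration set, so the underlying VLM needs no fine-tuning. At run time, a confident prediction yields a one-label set and the robot proceeds, while an ambiguous or empty set triggers a request for help.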
Taxonomy
Topics: 3D Surveying and Cultural Heritage · Video Surveillance and Tracking Methods
Methods: ALIGN
