Latent State Estimation Helps UI Agents to Reason
William E Bishop, Alice Li, Christopher Rawles, Oriana Riva

TL;DR
This paper explores how large language models can estimate latent states in real-world UI tasks, improving reasoning and task completion by explicitly modeling unobservable environment aspects.
Contribution
It demonstrates that zero-shot prompting lets LLMs form point estimates of latent state, and that these estimates lead to significantly better reasoning and task success in autonomous UI agents.
Findings
LLMs infer various aspects of latent state with more than 76% accuracy.
Explicit latent state estimation improves task completion rates by up to 1.6 times.
Zero-shot prompting effectively enables LLMs to reason about non-deterministic environments.
Abstract
A common problem for agents operating in real-world environments is that the response of an environment to their actions may be non-deterministic and observed through noise. This renders environmental state and progress towards completing a task latent. Despite recent impressive demonstrations of LLMs' reasoning abilities on various benchmarks, whether LLMs can build estimates of latent state and leverage them for reasoning has not been explicitly studied. We investigate this problem in the real-world domain of autonomous UI agents. We establish that appropriately prompting LLMs in a zero-shot manner can be formally understood as forming point estimates of latent state in a textual space. In the context of autonomous UI agents we then show that LLMs used in this manner are more than 76% accurate at inferring various aspects of latent state, such as performed (vs. commanded) actions…
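The idea of prompting an LLM for a point estimate of latent state can be illustrated with a minimal sketch. It is not the authors' prompt or code: the `Step` record, the prompt wording, and the function names are all hypothetical, and `llm` stands for any text-in, text-out callable supplied by the reader. The sketch asks the model which action was actually performed, given the commanded action and textual descriptions of the screen before and after it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One agent step (hypothetical record, not from the paper)."""
    commanded_action: str  # what the agent asked the UI to do
    screen_before: str     # textual description of the screen before acting
    screen_after: str      # textual description of the screen after acting


def build_latent_state_prompt(goal: str, history: List[Step]) -> str:
    """Assemble a zero-shot prompt asking for the performed action."""
    lines = [f"Goal: {goal}", ""]
    for i, step in enumerate(history, start=1):
        lines += [
            f"Step {i}",
            f"  Commanded action: {step.commanded_action}",
            f"  Screen before: {step.screen_before}",
            f"  Screen after: {step.screen_after}",
        ]
    lines += [
        "",
        "Question: Which action was actually performed in the last step? "
        "Answer in one sentence.",
    ]
    return "\n".join(lines)


def estimate_performed_action(
    goal: str, history: List[Step], llm: Callable[[str], str]
) -> str:
    """Return the model's textual point estimate of the performed action."""
    return llm(build_latent_state_prompt(goal, history)).strip()


if __name__ == "__main__":
    # Stand-in for a real model call; replace with any LLM client.
    def fake_llm(prompt: str) -> str:
        return " The tap did not register; the screen is unchanged. "

    steps = [Step("tap 'Send'", "Draft email open", "Draft email open")]
    print(estimate_performed_action("Send the email", steps, fake_llm))
```

The returned string is a single textual estimate, with no distribution over possibilities, which is the sense in which it is a point estimate. An agent could feed that estimate into its next reasoning prompt in place of the commanded action.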
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
