Affordance Representation and Recognition for Autonomous Agents
Habtom Kahsay Gidey, Niklas Huber, Alexander Lenz, and Alois Knoll

TL;DR
This paper presents a pattern language for world modeling in autonomous agents, enabling efficient, adaptive, and scalable understanding of web environments through compact representations and dynamic affordance recognition.
Contribution
It introduces two architectural patterns—DOM Transduction and Hypermedia Affordances Recognition—for improved world modeling from structured data.
Findings
Efficiently distills raw DOM into compact, task-relevant representations.
Enables dynamic discovery and integration of web service capabilities.
Supports scalable and adaptive web automation.
Abstract
The autonomy of software agents is fundamentally dependent on their ability to construct an actionable internal world model from the structured data that defines their digital environment, such as the Document Object Model (DOM) of web pages and the semantic descriptions of web services. However, constructing this world model from raw structured data presents two critical challenges: the verbosity of raw HTML makes it computationally intractable for direct use by foundation models, while the static nature of hardcoded API integrations prevents agents from adapting to evolving services. This paper introduces a pattern language for world modeling from structured data, presenting two complementary architectural patterns. The DOM Transduction Pattern addresses the challenge of web page complexity by distilling a verbose, raw DOM into a compact, task-relevant representation or world model…
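To make the idea of distilling a raw DOM more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the names `DomTransducer`, `transduce`, `INTERACTIVE`, and `KEEP_ATTRS`, as well as the choice of which tags and attributes count as task-relevant, are assumptions made only for illustration. It uses the standard-library `html.parser` module to keep interactive elements and drop everything else.

```python
from html.parser import HTMLParser

# Assumption: these tags and attributes are treated as "task-relevant".
# The paper may select elements quite differently.
INTERACTIVE = {"a", "button", "input", "select", "textarea"}
KEEP_ATTRS = ("id", "name", "type", "href", "placeholder", "aria-label")
VOID = {"input"}  # interactive tags that never contain text


class DomTransducer(HTMLParser):
    """Hypothetical transducer: collects a compact list of interactive elements."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # interactive element currently collecting text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE:
            return  # layout and content tags are dropped
        attr_map = dict(attrs)
        entry = {"tag": tag}
        # Keep only the whitelisted, non-empty attributes.
        entry.update({k: attr_map[k] for k in KEEP_ATTRS if attr_map.get(k)})
        self.elements.append(entry)
        self._open = None if tag in VOID else entry

    def handle_data(self, data):
        text = data.strip()
        # Text outside interactive elements (including script bodies) is ignored.
        if self._open is not None and text:
            self._open["text"] = (self._open.get("text", "") + " " + text).strip()

    def handle_endtag(self, tag):
        if self._open is not None and self._open["tag"] == tag:
            self._open = None


def transduce(html):
    """Return a compact list of dicts describing the page's interactive elements."""
    parser = DomTransducer()
    parser.feed(html)
    parser.close()
    return parser.elements


if __name__ == "__main__":
    page = (
        '<div class="x"><script>var a=1;</script><p>Hello</p>'
        '<a href="/login" class="btn">Sign in</a>'
        '<input type="text" name="q" placeholder="Search">'
        '<button id="go"><span>Go</span></button></div>'
    )
    for element in transduce(page):
        print(element)
```

In this sketch the wrapper markup, script body, paragraph text, and styling attributes are discarded, leaving one small record per actionable element. A real system would likely also need to handle nesting, visibility, and task-specific filtering, none of which is attempted here.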
