Sensi: Learn One Thing at a Time -- Curriculum-Based Test-Time Learning for LLM Game Agents
Mohsen Arjmandi

TL;DR
Sensi introduces a structured, curriculum-based test-time learning framework for LLM game agents, significantly improving sample efficiency and diagnosing perceptual grounding issues in complex environments.
Contribution
The paper presents Sensi, an LLM agent architecture that combines curriculum-based learning with a database-as-control-plane to make test-time learning in game environments more sample-efficient.
Findings
Sensi v1 solves 2 game levels using the two-player architecture alone.
Sensi v2 achieves 50-94x greater sample efficiency.
Identifies perceptual hallucination cascade as a key failure mode.
Abstract
Large language model (LLM) agents deployed in unknown environments must learn task structure at test time, but current approaches require thousands of interactions to form useful hypotheses. We present Sensi, an LLM agent architecture for the ARC-AGI-3 game-playing challenge that introduces structured test-time learning through three mechanisms: (1) a two-player architecture separating perception from action, (2) a curriculum-based learning system managed by an external state machine, and (3) a database-as-control-plane that makes the agent's context window programmatically steerable. We further introduce an LLM-as-judge component with dynamically generated evaluation rubrics to determine when the agent has learned enough about one topic to advance to the next. We report results across two iterations: Sensi v1 solves 2 game levels using the two-player architecture alone, while Sensi v2…
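The following is a minimal sketch of how mechanisms (2) and (3) and the judge gate might fit together. It is not the paper's implementation. The table schema, the function names (`init_db`, `build_context`, `step`), the topic names, the 0.8 threshold, and the `stub_judge` placeholder (which stands in for a real LLM-as-judge call) are all assumptions made for illustration. The two-player perception/action split is not modeled here.

```python
import sqlite3

def init_db(topics):
    """Create an in-memory curriculum table; the first topic starts active."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE curriculum ("
        "pos INTEGER PRIMARY KEY, topic TEXT, status TEXT, notes TEXT)"
    )
    for i, topic in enumerate(topics):
        status = "active" if i == 0 else "pending"
        conn.execute(
            "INSERT INTO curriculum VALUES (?, ?, ?, '')", (i, topic, status)
        )
    conn.commit()
    return conn

def current_topic(conn):
    """Return the single active topic, or None when the curriculum is done."""
    row = conn.execute(
        "SELECT topic FROM curriculum WHERE status = 'active' ORDER BY pos LIMIT 1"
    ).fetchone()
    return row[0] if row else None

def build_context(conn):
    """Assemble prompt text purely from database rows (the 'control plane')."""
    done = conn.execute(
        "SELECT topic, notes FROM curriculum WHERE status = 'done' ORDER BY pos"
    ).fetchall()
    lines = [f"Learned {topic}: {notes}" for topic, notes in done]
    topic = current_topic(conn)
    lines.append(f"Current focus: {topic}" if topic else "Curriculum complete")
    return "\n".join(lines)

def stub_judge(topic, notes):
    """Hypothetical placeholder for an LLM judge: scores notes by length only."""
    return 1.0 if len(notes.split()) >= 3 else 0.0

def step(conn, notes, judge, threshold=0.8):
    """Record notes for the active topic; advance only if the judge approves."""
    topic = current_topic(conn)
    if topic is None:
        return None
    conn.execute("UPDATE curriculum SET notes = ? WHERE topic = ?", (notes, topic))
    if judge(topic, notes) >= threshold:
        conn.execute("UPDATE curriculum SET status = 'done' WHERE topic = ?", (topic,))
        # Activate the next pending topic, if any remains.
        conn.execute(
            "UPDATE curriculum SET status = 'active' WHERE pos = "
            "(SELECT MIN(pos) FROM curriculum WHERE status = 'pending')"
        )
    conn.commit()
    return current_topic(conn)
```

In this sketch the agent's prompt is rebuilt from `build_context` on every turn, so editing a row in the table changes what the agent sees next without touching the agent code. This is one way to read "programmatically steerable", under the assumptions stated above.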
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
