MemCtrl: Using MLLMs as Active Memory Controllers on Embodied Agents

Vishnu Sashank Dorbala; Dinesh Manocha

arXiv:2601.20831·cs.AI·January 29, 2026

TL;DR

MemCtrl is a framework that adds a trainable memory gate to Multimodal Large Language Models so that embodied agents can manage memory online, yielding a 16% average improvement on EmbodiedBench tasks.

Contribution

This work presents MemCtrl, a new approach using trainable memory gates in MLLMs for online memory pruning, tailored for embodied agents with strict memory and compute constraints.

Findings

01. 16% average improvement on EmbodiedBench tasks

02. Over 20% improvement on specific instruction subsets

03. Qualitative analysis shows better handling of complex instructions

Abstract

Foundation models rely on in-context learning for personalized decision making. The limited size of the context window necessitates memory compression and retrieval systems such as RAG. These systems, however, often treat memory as a large offline storage space, which is unfavorable for embodied agents that are expected to operate online under strict memory and compute constraints. In this work, we propose MemCtrl, a novel framework that uses Multimodal Large Language Models (MLLMs) to prune memory online. MemCtrl augments MLLMs with a trainable memory head μ that acts as a gate to determine which observations or reflections to retain, update, or discard during exploration. We evaluate two ways of training μ, 1) via an offline expert and 2) via online RL, and observe significant improvement in overall embodied task completion ability on μ-augmented MLLMs. In particular, on…
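The retain/update/discard gating described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `GatedMemory`, `GateAction`, `toy_gate`, the fixed `capacity`, and the oldest-first eviction are all hypothetical names and choices introduced here, and the hand-written rule in `toy_gate` merely stands in for a trained memory head μ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class GateAction(Enum):
    RETAIN = "retain"    # store the observation as a new memory entry
    UPDATE = "update"    # overwrite the most recent entry with the observation
    DISCARD = "discard"  # drop the observation without storing it


@dataclass
class GatedMemory:
    # `gate` is a hypothetical stand-in for the trainable memory head;
    # it sees the new observation and the current memory and picks an action.
    gate: Callable[[str, List[str]], GateAction]
    capacity: int = 4  # assumed hard budget on stored entries
    entries: List[str] = field(default_factory=list)

    def step(self, observation: str) -> GateAction:
        action = self.gate(observation, self.entries)
        if action is GateAction.RETAIN:
            self.entries.append(observation)
            if len(self.entries) > self.capacity:
                self.entries.pop(0)  # assumed policy: evict the oldest entry
        elif action is GateAction.UPDATE:
            if self.entries:
                self.entries[-1] = observation
            else:
                self.entries.append(observation)  # nothing to overwrite yet
        return action


def toy_gate(observation: str, memory: List[str]) -> GateAction:
    """Hand-written rule used only for illustration (not a learned gate)."""
    if not observation.strip() or observation in memory:
        return GateAction.DISCARD  # empty or already stored
    if memory and memory[-1].split()[0] == observation.split()[0]:
        return GateAction.UPDATE   # same leading object as the latest entry
    return GateAction.RETAIN


memory = GatedMemory(gate=toy_gate, capacity=2)
stream = ["fridge closed", "fridge open", "", "apple on table",
          "mug in sink", "apple on table"]
for obs in stream:
    memory.step(obs)
print(memory.entries)  # ['apple on table', 'mug in sink']
```

In this sketch the decision is made once per incoming observation, so memory never grows beyond the budget. A learned gate would replace `toy_gate` with a model output while keeping the same three-way interface.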

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Machine Learning in Healthcare