HTM-EAR: Importance-Preserving Tiered Memory with Hybrid Routing under Saturation
Shubham Kumar Singh

TL;DR
HTM-EAR is a hierarchical memory system with importance-aware eviction and hybrid routing. It manages long-term facts for long-running agents and preserves critical information when memory is saturated.
Contribution
The paper presents HTM-EAR, a novel tiered memory architecture combining importance-aware eviction and hybrid routing to improve fact retention under saturation.
Findings
The full system preserves active-query precision (MRR = 1.000) under saturation.
Compared to LRU, HTM-EAR retains more essential facts and achieves higher MRR.
Code is publicly available for reproducibility.
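For readers unfamiliar with the metric, MRR (mean reciprocal rank) averages 1/rank of the first correct item over all queries, so MRR = 1.000 means the correct fact was ranked first for every query. The following is a minimal sketch of the standard definition, not the paper's evaluation code; the function name and the dictionary-based inputs are illustrative assumptions.

```python
def mean_reciprocal_rank(rankings, relevant):
    """Standard MRR: the mean over queries of 1 / rank of the relevant item.

    rankings: dict mapping a query id to its ranked list of retrieved ids.
    relevant: dict mapping a query id to the single correct id.
    Both structures are hypothetical, chosen only for this illustration.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for query_id, ranked in rankings.items():
        target = relevant[query_id]
        for rank, item in enumerate(ranked, start=1):
            if item == target:
                total += 1.0 / rank
                break  # only the first correct hit counts
        # A query whose relevant item was never retrieved contributes 0.
    return total / len(rankings)
```

A query whose correct fact has been evicted and cannot be retrieved contributes zero, which is why retention under saturation and MRR are closely linked.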
Abstract
Memory constraints in long-running agents require structured management of accumulated facts while preserving essential information under bounded context limits. We introduce HTM-EAR, a hierarchical tiered memory substrate that integrates HNSW-based working memory (L1) with archival storage (L2), combining importance-aware eviction and hybrid routing. When L1 reaches capacity, items are evicted using a weighted score of importance and usage. Queries are first resolved in L1; if similarity or entity coverage is insufficient, retrieval falls back to L2, and candidates are re-ranked using a cross-encoder. We evaluate the system under sustained saturation (15,000 facts; L1 capacity 500; L2 capacity 5000) using synthetic streams across five random seeds and real BGL system logs. Ablation studies compare the full system against variants without cross-encoder re-ranking, without routing…
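The eviction and routing flow described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: a brute-force cosine search stands in for the HNSW index, the `rerank` callable stands in for the cross-encoder, and the weight `alpha`, the thresholds `sim_min` and `cov_min`, the usage normalization, and the policy of dropping the lowest-scoring item when L2 overflows are all hypothetical choices that the abstract does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    key: str
    vector: list
    entities: set
    importance: float  # assumed to lie in [0, 1]
    hits: int = 0      # usage counter, bumped when the item is returned


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class TieredMemory:
    def __init__(self, l1_cap, l2_cap, alpha=0.7, sim_min=0.8, cov_min=1.0):
        self.l1, self.l2 = {}, {}
        self.l1_cap, self.l2_cap = l1_cap, l2_cap
        # alpha, sim_min, cov_min are assumed values, not taken from the paper.
        self.alpha, self.sim_min, self.cov_min = alpha, sim_min, cov_min

    def score(self, item):
        # Weighted score of importance and usage (weights are an assumption).
        usage = item.hits / (1 + item.hits)  # squash hit count into [0, 1)
        return self.alpha * item.importance + (1 - self.alpha) * usage

    def insert(self, item):
        self.l1[item.key] = item
        while len(self.l1) > self.l1_cap:
            # L1 is full: demote the lowest-scoring item to archival L2.
            victim = min(self.l1.values(), key=self.score)
            del self.l1[victim.key]
            self.l2[victim.key] = victim
        while len(self.l2) > self.l2_cap:
            # Assumed policy: L2 drops its lowest-scoring item when full.
            victim = min(self.l2.values(), key=self.score)
            del self.l2[victim.key]

    def _search(self, tier, qvec, k):
        scored = [(cosine(qvec, it.vector), it) for it in tier.values()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

    def query(self, qvec, qentities, k=3, rerank=None):
        hits = self._search(self.l1, qvec, k)
        best = hits[0][0] if hits else 0.0
        found = set().union(*(it.entities for _, it in hits))
        coverage = len(qentities & found) / len(qentities) if qentities else 1.0
        # Fall back to L2 when similarity or entity coverage is insufficient.
        used_l2 = best < self.sim_min or coverage < self.cov_min
        if used_l2:
            hits = hits + self._search(self.l2, qvec, k)
        if rerank is not None:
            # Stand-in for cross-encoder re-ranking: rerank(item) -> float.
            hits = [(rerank(it), it) for _, it in hits]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        top = [it for _, it in hits[:k]]
        for it in top:
            it.hits += 1
        return [it.key for it in top], used_l2
```

In this sketch a low-importance, rarely used item is demoted first, while a query that L1 cannot answer well (low top similarity or missing entities) still reaches the demoted item through the L2 fallback.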
Taxonomy
Topics: Semantic Web and Ontologies · Graph Theory and Algorithms · Software System Performance and Reliability
