Memori: A Persistent Memory Layer for Efficient, Context-Aware LLM Agents

Luiz C. Borro; Luiz A. B. Macarini; Gordon Tindall; Michael Montero; Adam B. Struck

arXiv:2603.19935 · cs.LG · March 23, 2026

Open Access

TL;DR

Memori introduces a vendor-agnostic persistent memory layer for LLMs that uses structured semantic representations to improve context-awareness, reduce token costs, and enhance multi-session interactions.

Contribution

The paper presents Memori, a novel memory system that converts dialogue into structured data, enabling efficient, scalable, and cost-effective context management for LLM agents.

Findings

01. Achieves 81.95% accuracy on the LoCoMo benchmark

02. Uses only 1,294 tokens per query, about 5% of full context

03. Reduces token usage by 67% compared to existing methods
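
As a quick consistency check on the second finding, the two reported figures imply a rough size for the full-context baseline. The sketch below is back-of-envelope arithmetic only: the variable names are illustrative, and because "about 5%" is a rounded value, the result is an approximation rather than a number stated in the paper.

```python
# Figures taken from the findings above.
tokens_per_query = 1294      # tokens Memori uses per query
percent_of_full_context = 5  # "about 5%", a rounded value

# Implied size of the full-context prompt (an estimate, not a reported figure).
implied_full_context = tokens_per_query * 100 / percent_of_full_context

print(f"Implied full context: roughly {implied_full_context:,.0f} tokens")
```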

Abstract

As large language models (LLMs) evolve into autonomous agents, persistent memory at the API layer is essential for enabling context-aware behavior across LLMs and multi-session interactions. Existing approaches force vendor lock-in and rely on injecting large volumes of raw conversation into prompts, leading to high token costs and degraded performance. We introduce Memori, an LLM-agnostic persistent memory layer that treats memory as a data structuring problem. Its Advanced Augmentation pipeline converts unstructured dialogue into compact semantic triples and conversation summaries, enabling precise retrieval and coherent reasoning. Evaluated on the LoCoMo benchmark, Memori achieves 81.95% accuracy, outperforming existing memory systems while using only 1,294 tokens per query (~5% of full context). This results in substantial cost reductions, including 67% fewer tokens than…
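
To make the idea of memory as a data structuring problem concrete, here is a minimal sketch of storing dialogue facts as semantic triples and retrieving only the relevant ones for a prompt. It is not Memori's API: the `Triple` and `TripleMemory` names, the hand-written triples, and the word-overlap scoring are all assumptions chosen for illustration. In a real pipeline an LLM would presumably do the extraction, and the abstract does not say how retrieval is ranked.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A (subject, predicate, object) fact. Hypothetical type, not from the paper."""
    subject: str
    predicate: str
    obj: str

    def text(self) -> str:
        return f"{self.subject} {self.predicate} {self.obj}"


def _words(text: str) -> set[str]:
    # Lowercased word tokens, ignoring punctuation.
    return set(re.findall(r"\w+", text.lower()))


class TripleMemory:
    """Toy in-memory store; stands in for a persistent memory layer."""

    def __init__(self) -> None:
        self._triples: list[Triple] = []

    def add(self, subject: str, predicate: str, obj: str) -> None:
        triple = Triple(subject, predicate, obj)
        if triple not in self._triples:  # skip exact duplicates
            self._triples.append(triple)

    def retrieve(self, query: str, k: int = 3) -> list[Triple]:
        # Assumed scoring: count of words shared between query and triple.
        query_words = _words(query)
        scored = [(len(_words(t.text()) & query_words), t) for t in self._triples]
        relevant = [(s, t) for s, t in scored if s > 0]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [t for _, t in relevant[:k]]

    def build_context(self, query: str, k: int = 3) -> str:
        # Compact context to inject into a prompt instead of raw dialogue.
        return "\n".join(t.text() for t in self.retrieve(query, k))


# Triples written by hand here; extraction from dialogue is out of scope.
memory = TripleMemory()
memory.add("Alice", "lives in", "Lisbon")
memory.add("Alice", "works as", "nurse")
memory.add("Bob", "owns", "dog")

print(memory.build_context("What does Bob own?", k=1))
```

The point of the design is that only the few matching facts reach the prompt, which is how a structured store can keep per-query token counts far below replaying the full conversation.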

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques