Mela: Test-Time Memory Consolidation based on Transformation Hypothesis

Lungchuan Chen

arXiv:2605.10537·cs.CL·May 12, 2026


TL;DR

Mela is a test-time memory consolidation approach built on a neuroscience-inspired hierarchical memory module. It improves language model performance and helps models handle contexts longer than those seen in training.

Contribution

The paper proposes the Hierarchical Memory Module (HMM) and integrates it into Transformers to create Mela, enabling online memory consolidation at test time with multi-granularity representations.

Findings

1. Mela outperforms Transformer baselines across all model sizes.

2. Mela maintains performance on longer contexts beyond training length.

3. Ablation studies validate each component's effectiveness.

Abstract

Memory consolidation, the process by which transient experiences are transformed into stable, structured representations, is a foundational organizing principle in the human brain, yet it remains largely unexplored as a design principle for modern sequence models. In this work, we leverage established neuroscientific theories of memory consolidation and cross-frequency coupling to propose the Hierarchical Memory Module (HMM), a neural memory architecture composed of two functionally distinct sub-modules that operate at different update frequencies. Inspired by the transformation hypothesis, the low-frequency sub-module produces high-level representations that capture abstract, gist-level knowledge, while the high-frequency sub-module produces fine-grained representations that preserve richer episodic detail. The final memory output is dynamically reconstructed as a context-dependent…
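
To make the idea of two sub-modules running at different update frequencies concrete, here is a minimal toy sketch. It is not the paper's HMM: the class name `TwoRateMemory`, the exponential-moving-average updates, and the parameters `slow_every`, `fast_rate`, and `slow_rate` are all assumptions chosen for illustration. The sketch deliberately stops short of combining the two states into one output, because the abstract above is truncated before it says how that reconstruction works.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TwoRateMemory:
    """Toy two-timescale memory (illustrative only, not the paper's HMM)."""

    dim: int
    slow_every: int = 4      # hypothetical: slow state updates every 4 steps
    fast_rate: float = 0.5   # hypothetical EMA rate for the fast state
    slow_rate: float = 0.1   # hypothetical EMA rate for the slow state
    fast: List[float] = field(default_factory=list)
    slow: List[float] = field(default_factory=list)
    step: int = 0

    def __post_init__(self) -> None:
        # Both states start empty (all zeros).
        self.fast = [0.0] * self.dim
        self.slow = [0.0] * self.dim

    def update(self, x: List[float]) -> None:
        """Consume one input vector and update the memory states."""
        self.step += 1
        # High-frequency state: updated on every step, tracks recent detail.
        self.fast = [
            (1 - self.fast_rate) * f + self.fast_rate * xi
            for f, xi in zip(self.fast, x)
        ]
        # Low-frequency state: updated only every `slow_every` steps,
        # absorbing a smoothed summary of the fast state.
        if self.step % self.slow_every == 0:
            self.slow = [
                (1 - self.slow_rate) * s + self.slow_rate * f
                for s, f in zip(self.slow, self.fast)
            ]


mem = TwoRateMemory(dim=2)
for _ in range(3):
    mem.update([1.0, 1.0])
slow_after_three = list(mem.slow)   # slow state has not been updated yet
mem.update([1.0, 1.0])              # fourth step triggers the slow update
```

In this toy, the fast state moves toward each new input immediately, while the slow state only changes on every fourth step. That mirrors the episodic-detail versus gist-level split described in the abstract only in a loose sense.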


Code & Models

Repositories

musubi-ai/Mela (GitHub)
