HiNS: Hierarchical Negative Sampling for More Comprehensive Memory Retrieval Embedding Model

Motong Tian; Allen P. Wong; Mingjun Mao; Wangchunshu Zhou

arXiv:2601.14857 · cs.CL · January 22, 2026

Open Access

TL;DR

This paper introduces HiNS, a hierarchical negative sampling framework that improves memory retrieval in language agents by modeling negative sample difficulty and distribution, leading to better performance in memory-intensive tasks.

Contribution

HiNS is a novel data construction method that explicitly models negative sample difficulty tiers and their ratios, improving embedding models' ability to discriminate relevant memories from distractors during memory retrieval.

Findings

01. Significant performance improvements on LoCoMo and PERSONAMEM datasets.

02. Memory retrieval F1/BLEU-1 gains of over 3% on MemoryOS.

03. Total score improvements of over 2.5% on Mem0.

Abstract

Memory-augmented language agents rely on embedding models for effective memory retrieval. However, existing training data construction overlooks a critical limitation: the hierarchical difficulty of negative samples and their natural distribution in human-agent interactions. In practice, some negatives are semantically close distractors while others are trivially irrelevant, and natural dialogue exhibits structured proportions of these types. Current approaches using synthetic or uniformly sampled negatives fail to reflect this diversity, limiting embedding models' ability to learn nuanced discrimination essential for robust memory retrieval. In this work, we propose a principled data construction framework HiNS that explicitly models negative sample difficulty tiers and incorporates empirically grounded negative ratios derived from conversational data, enabling the training of…
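As a rough illustration of the idea described in the abstract, the sketch below draws negatives from separate difficulty tiers according to fixed target ratios. It is not the authors' implementation: the tier names (`hard`, `medium`, `easy`), the ratio values in `TIER_RATIOS`, and the helper functions `allocate_counts` and `sample_negatives` are all hypothetical, chosen only to show how tier-aware sampling might be wired up. The source does not state the actual tiers or proportions.

```python
import random
from collections import Counter

# Hypothetical tier names and ratios, for illustration only;
# the source does not give the actual tiers or proportions.
TIER_RATIOS = {"hard": 0.2, "medium": 0.3, "easy": 0.5}


def allocate_counts(total, ratios):
    """Split `total` negatives across tiers using largest-remainder rounding."""
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("tier ratios must sum to 1")
    raw = {tier: total * r for tier, r in ratios.items()}
    counts = {tier: int(v) for tier, v in raw.items()}
    leftover = total - sum(counts.values())
    # Hand remaining slots to the tiers with the largest fractional parts.
    by_fraction = sorted(raw, key=lambda t: raw[t] - counts[t], reverse=True)
    for tier in by_fraction[:max(leftover, 0)]:
        counts[tier] += 1
    return counts


def sample_negatives(pools, total, ratios=TIER_RATIOS, seed=None):
    """Draw negatives from per-tier candidate pools, without replacement within a tier."""
    rng = random.Random(seed)
    counts = allocate_counts(total, ratios)
    negatives = []
    for tier, k in counts.items():
        pool = pools.get(tier, [])
        k = min(k, len(pool))  # a small pool caps that tier's share
        negatives.extend((tier, text) for text in rng.sample(pool, k))
    return negatives


if __name__ == "__main__":
    # Toy candidate pools; in practice each tier would be filled by some
    # difficulty criterion (an assumption, not specified in the source).
    pools = {t: [f"{t}-memory-{i}" for i in range(10)] for t in TIER_RATIOS}
    batch = sample_negatives(pools, total=10, seed=0)
    print(Counter(tier for tier, _ in batch))
```

Largest-remainder rounding is used here so that the per-tier counts always add up to the requested total, even when the ratios do not divide it evenly. If a tier's pool is smaller than its share, the sketch simply returns fewer negatives rather than backfilling from another tier, which would change the intended mix.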

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning