Nautilus Compass: Black-box Persona Drift Detection for Production LLM Agents

Chunxiao Wang

arXiv:2605.09863 · cs.CR · May 12, 2026

TL;DR

Nautilus Compass is a black-box, prompt-text-based persona drift detection system for production LLM agents, operating without model weights and providing tamper-evident audit logs.

Contribution

It introduces a black-box persona drift detection method that works solely at the prompt-text layer, so it applies to closed-API LLMs, and it ships with a full system implementation.

Findings

01. Achieves ROC AUC 0.83 for drift detection on real session traces.

02. Outperforms baseline retrieval pipelines on LongMemEval-S and EverMemBench-Dynamic.

03. Reproduction cost is approximately $3.50, significantly cheaper than some existing systems.
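To make the first finding concrete, the sketch below computes ROC AUC directly from its pairwise definition: the fraction of (drifted, non-drifted) session pairs in which the drifted session receives the higher detector score, with ties counted as half. The function name, the label convention (1 = drifted, 0 = not drifted), and the "higher score means more drift-like" orientation are assumptions for illustration, not details taken from the paper.

```python
def roc_auc(scores, labels):
    """ROC AUC from the pairwise definition.

    Assumptions (illustrative, not from the paper):
      - labels use 1 for a drifted session and 0 for a non-drifted one
      - a higher score means the detector considers the session more drift-like
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # drifted session correctly ranked higher
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))
```

Under this reading, an AUC of 0.83 would mean that a randomly chosen drifted session outranks a randomly chosen non-drifted one about 83% of the time. The quadratic loop is fine for small trace sets; a rank-based formulation would be preferable at scale.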

Abstract

Production LLM coding agents drift over long sessions: they forget user-specified constraints, slip into mistakes the user already flagged, and confabulate prior agreements. White-box approaches such as persona vectors require model weights and so cannot be applied to the closed APIs (Claude, GPT-4) that most users actually interact with. We present Nautilus Compass, a black-box persona drift detector and agent memory layer for production coding agents. The method operates entirely at the prompt-text layer: it computes cosine similarity between BGE-m3 embeddings of user prompts and behavioral anchor texts, then aggregates the similarities with a weighted top-k mean. Compass is, to our knowledge, the only public agent memory layer (among Mem0, Letta, Cognee, Zep, MemOS, and smrti, as verified in May 2026) that does not call an LLM at index time to extract facts or build a graph; raw conversation text is embedded directly. The system…
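The scoring step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the vectors stand in for BGE-m3 embeddings (no embedding model is loaded here), the function names are hypothetical, and the geometric rank weights (decay ** rank) are an assumed weighting scheme, since the listing does not say how the top-k weights are chosen.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def weighted_topk_mean(sims, k=3, decay=0.5):
    """Weighted mean of the k largest similarities.

    Assumption: the similarity at rank r (0-based, best first) gets
    weight decay ** r. The paper's actual weights are not given here.
    """
    if not sims or k <= 0:
        raise ValueError("need at least one similarity and k >= 1")
    top = sorted(sims, reverse=True)[:k]
    weights = [decay ** r for r in range(len(top))]
    return sum(w * s for w, s in zip(weights, top)) / sum(weights)


def anchor_score(prompt_vec, anchor_vecs, k=3, decay=0.5):
    """Score one prompt embedding against a set of anchor embeddings."""
    sims = [cosine(prompt_vec, a) for a in anchor_vecs]
    return weighted_topk_mean(sims, k=k, decay=decay)


# Toy 3-dimensional vectors standing in for real embeddings.
anchors = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
prompt = [0.9, 0.1, 0.0]
score = anchor_score(prompt, anchors, k=2)
```

Because everything here runs on prompt text and precomputed embeddings, no access to model weights or an LLM call is needed at scoring time, which is consistent with the black-box constraint the abstract emphasizes.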

Code & Models

Repositories

chunxiaoxx/nautilus-compass
github
