Topology-Enhanced Alignment for Large Language Models: Trajectory Topology Loss and Topological Preference Optimization

Yurui Pan; Ke Xu; Bo Peng

arXiv:2605.07172·cs.CL·May 11, 2026

TL;DR

This paper introduces a topology-aware framework for aligning large language models: it regularizes semantic trajectories in hidden space with 0-dimensional persistent homology and reports better preference metrics than non-topological baselines.

Contribution

The paper proposes Trajectory Topology Loss (TTL) for SFT and Topological Preference Optimization (TPO) for DPO, two methods that use topological features to improve LLM alignment during fine-tuning.

Findings

01 Topology-enhanced objectives outperform non-topological baselines on preference metrics.

02 Persistent homology captures meaningful semantic bridges in model trajectories.

03 The methods keep toxicity at the same level or reduce it while improving alignment quality.

Abstract

Alignment of large language models (LLMs) via SFT and RLHF/DPO typically ignores the global geometry of the representation space, relying instead on local token likelihoods or scalar scores. We view generation as tracing a semantic trajectory in hidden space and propose a topology-enhanced alignment framework that regularizes these trajectories using 0-dimensional persistent homology. First, for SFT, we introduce Trajectory Topology Loss (TTL). Treating prompt and gold-answer embeddings as a mixed point cloud, we use a 0D persistent homology algorithm to extract "prompt-answer bridges." TTL aligns the model's actual update direction with these topological bridges rather than arbitrary directions. Second, for DPO, we propose Topological Preference Optimization (TPO). TPO constructs topic-specific semantic preference vectors and aligns the improvement direction between rejected and chosen…
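
To make the "prompt-answer bridge" idea concrete, here is a minimal sketch of 0-dimensional persistent homology on a mixed point cloud. It relies on a standard fact: for a Vietoris-Rips filtration, the 0D death events are exactly the edges that Kruskal's algorithm adds to the minimum spanning tree. The function names, the use of Euclidean distance, and the rule that a "bridge" is any merge edge with one endpoint in the prompt and one in the answer are assumptions made for illustration; the abstract does not specify how the paper defines or selects bridges.

```python
import numpy as np


def zero_dim_persistence(points):
    """Return the merge events (death_scale, i, j) of 0D persistent homology.

    Each event is the edge that joins two connected components as the
    distance threshold grows, i.e. an edge of the minimum spanning tree.
    """
    n = len(points)
    # Pairwise Euclidean distances (assumed metric, hypothetical choice).
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = sorted(
        (float(dists[i, j]), i, j) for i in range(n) for j in range(i + 1, n)
    )

    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # two components merge: one 0D class dies at scale d
            merges.append((d, i, j))
    return merges


def prompt_answer_bridges(prompt_emb, answer_emb):
    """Keep merge edges that connect a prompt point to an answer point.

    Hypothetical bridge criterion: the edge straddles the two groups.
    Returns (scale, prompt_index, answer_index) tuples.
    """
    cloud = np.vstack([prompt_emb, answer_emb])
    n_prompt = len(prompt_emb)
    bridges = []
    for d, i, j in zero_dim_persistence(cloud):
        # Edges are stored with i < j, so a straddling edge has i in the prompt.
        if i < n_prompt <= j:
            bridges.append((d, i, j - n_prompt))
    return bridges


# Toy example: two prompt points and two answer points on a line.
prompt = np.array([[0.0, 0.0], [1.0, 0.0]])
answer = np.array([[5.0, 0.0], [6.0, 0.0]])
bridges = prompt_answer_bridges(prompt, answer)
```

In this toy cloud the two prompt points merge first, then the two answer points, and the last merge is the single edge linking the groups. A loss in the spirit of TTL could then compare a model's update direction with the vector along such an edge, but that step is not sketched here because the abstract does not give its form.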
