$\phi^{\infty}$: Clause Purification, Embedding Realignment, and the Total Suppression of the Em Dash in Autoregressive Language Models

Bugra Kilictas; Faruk Alpay

arXiv:2506.18129 · cs.CL · June 24, 2025

TL;DR

The paper argues that the em dash token induces recursive semantic drift in autoregressive transformer language models, and proposes a token suppression method that requires no retraining and is reported to improve generation consistency.

Contribution

The paper introduces the $\phi^{\infty}$ operator for symbolic clause purification and combines it with targeted embedding matrix realignment, so that problematic tokens can be suppressed without retraining the model.
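The listing does not define the operator, so the following is only a minimal sketch of the general idea of applying a rewrite step until it reaches a fixed point. The names `purify_once` and `purify_to_fixed_point`, and the rule of replacing each em dash with a comma, are assumptions made for illustration and are not taken from the paper.

```python
import re

EM_DASH = "\u2014"  # the em dash character, written as an escape

# Hypothetical rewrite rule: an em dash plus any surrounding whitespace.
_DASH_PATTERN = re.compile(r"\s*" + EM_DASH + r"\s*")


def purify_once(text: str) -> str:
    """Apply one rewrite step: replace the first em dash with a comma and a space."""
    return _DASH_PATTERN.sub(", ", text, count=1)


def purify_to_fixed_point(text: str, max_iters: int = 1000) -> str:
    """Repeat purify_once until the text stops changing (a fixed point)."""
    for _ in range(max_iters):
        rewritten = purify_once(text)
        if rewritten == text:
            # No em dash left to rewrite, so another step changes nothing.
            return text
        text = rewritten
    raise RuntimeError("rewrite did not converge within max_iters steps")


if __name__ == "__main__":
    print(purify_to_fixed_point("a" + EM_DASH + "b " + EM_DASH + " c"))
```

Each step removes exactly one em dash and adds none, so this toy rewrite terminates after as many steps as there are em dashes in the input. Whether the paper's convergence guarantee has the same form cannot be determined from this listing.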

Findings

01. Significant improvement in generation consistency

02. Effective suppression of em dash induced errors

03. Framework applicable to broader token vulnerabilities

Abstract

We identify a critical vulnerability in autoregressive transformer language models where the em dash token induces recursive semantic drift, leading to clause boundary hallucination and embedding space entanglement. Through formal analysis of token-level perturbations in semantic lattices, we demonstrate that em dash insertion fundamentally alters the model's latent representations, causing compounding errors in long-form generation. We propose a novel solution combining symbolic clause purification via the phi-infinity operator with targeted embedding matrix realignment. Our approach enables total suppression of problematic tokens without requiring model retraining, while preserving semantic coherence through fixed-point convergence guarantees. Experimental validation shows significant improvements in generation consistency and topic maintenance. This work establishes a general…
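To make "suppression without retraining" concrete, here is a minimal sketch of decode-time logit masking. This is a common, simpler technique swapped in for illustration; it is not the paper's clause purification or embedding realignment, whose details are not given in this listing. The toy vocabulary, the logit values, and the function names are all assumptions.

```python
import math

# Toy vocabulary (an assumption for illustration); index 2 is the em dash.
VOCAB = ["the", ",", "\u2014", "."]


def suppress_tokens(logits, banned_ids):
    """Return a copy of logits with every banned token id set to -inf."""
    masked = list(logits)
    for token_id in banned_ids:
        masked[token_id] = float("-inf")
    if all(value == float("-inf") for value in masked):
        raise ValueError("every token is banned; nothing left to sample")
    return masked


def softmax(logits):
    """Numerically stable softmax; -inf logits receive probability 0.0."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    raw_logits = [1.0, 0.5, 2.0, 0.1]  # made-up scores; the em dash ranks highest
    banned = {VOCAB.index("\u2014")}
    probs = softmax(suppress_tokens(raw_logits, banned))
    best = max(range(len(probs)), key=probs.__getitem__)
    print(VOCAB[best], probs)
```

Masking leaves the model weights untouched and only changes the distribution that sampling sees, which is why it needs no retraining. It does not address the latent-representation effects the abstract describes, which the paper targets with embedding matrix realignment.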
