Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence

İlker Işık; Ramazan Gokberk Cinbis; Ebru Aydin Gol

arXiv:2410.17161·cs.CL·June 19, 2025

TL;DR

This paper introduces a novel token embedding method that enables language models to recognize interchangeable tokens and alpha-equivalence, improving generalization and reasoning in formal logic tasks.

Contribution

It formalizes the problem of token interchangeability, proposes alpha-covariance as a robustness metric, and introduces a dual-part embedding strategy to enhance model flexibility.
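As a rough illustration of what robustness to renaming could mean, the sketch below checks whether a toy sequence model commutes with renaming of interchangeable tokens, i.e. whether `model(rename(x)) == rename(model(x))`. This is only one plausible reading of the idea; it is not the paper's definition of alpha-covariance, and the names `rename`, `covariance_rate`, `copy_model`, and `biased_model` are hypothetical.

```python
from itertools import permutations


def rename(tokens, mapping):
    """Rename interchangeable tokens; all other tokens pass through unchanged."""
    return [mapping.get(t, t) for t in tokens]


def covariance_rate(model, inputs, variables):
    """Fraction of (input, renaming) pairs where the model commutes with renaming.

    ASSUMPTION: an illustrative score, not the metric defined in the paper.
    """
    hits = 0
    total = 0
    for perm in permutations(variables):
        mapping = dict(zip(variables, perm))
        for x in inputs:
            if model(rename(x, mapping)) == rename(model(x), mapping):
                hits += 1
            total += 1
    return hits / total


def copy_model(tokens):
    """Toy model that copies its input; it treats all variables alike."""
    return list(tokens)


def biased_model(tokens):
    """Toy model that rewrites "a" to "b"; it treats one variable specially."""
    return ["b" if t == "a" else t for t in tokens]


variables = ["a", "b", "c"]
inputs = [["a", "&", "b"], ["G", "c"], ["a", "U", "c"]]

copy_rate = covariance_rate(copy_model, inputs, variables)
biased_rate = covariance_rate(biased_model, inputs, variables)
print(copy_rate, biased_rate)
```

The copying model scores 1.0 because it never distinguishes one variable name from another, while the biased model loses points whenever a renaming touches the token it special-cases.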

Findings

01. Improved generalization to unseen tokens in logic tasks

02. Enhanced recognition of alpha-equivalence

03. Favorable inductive bias for formal reasoning

Abstract

Language models lack the notion of interchangeable tokens: symbols that are semantically equivalent yet distinct, such as bound variables in formal logic. This limitation prevents generalization to larger vocabularies and hinders the model's ability to recognize alpha-equivalence, where renaming bound variables preserves meaning. We formalize this machine learning problem and introduce alpha-covariance, a metric for evaluating robustness to such transformations. To tackle this task, we propose a dual-part token embedding strategy: a shared component ensures semantic consistency, while a randomized component maintains token distinguishability. Compared to a baseline that relies on alpha-renaming for data augmentation, our approach demonstrates improved generalization to unseen tokens in linear temporal logic solving, propositional logic assignment prediction, and copying with an…
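The dual-part idea described in the abstract can be sketched in a few lines: every interchangeable token receives the same shared vector, concatenated with its own random vector. This is a minimal sketch under stated assumptions. The dimensions, the Gaussian sampling, the use of concatenation, and the function name `build_embeddings` are all illustrative choices, not details taken from the paper, and the random lists stand in for parameters that a real model would learn.

```python
import random


def build_embeddings(regular_tokens, interchangeable_tokens,
                     shared_dim=4, random_dim=4, seed=0):
    """Build a toy embedding table with dual-part vectors.

    ASSUMPTION: sizes, sampling scheme, and concatenation are illustrative only.
    """
    rng = random.Random(seed)

    def rand_vec(n):
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    table = {}
    # Regular tokens (operators, constants) get ordinary full-width vectors;
    # random values stand in for learned parameters here.
    for tok in regular_tokens:
        table[tok] = rand_vec(shared_dim + random_dim)

    # One shared part for all interchangeable tokens (semantic consistency).
    shared = rand_vec(shared_dim)
    for tok in interchangeable_tokens:
        # A per-token random part keeps the tokens distinguishable.
        table[tok] = shared + rand_vec(random_dim)
    return table


table = build_embeddings(["&", "|", "G"], ["a", "b", "c"])
print(table["a"][:4] == table["b"][:4])  # shared parts match
print(table["a"][4:] == table["b"][4:])  # random parts differ
```

Because the random part needs no training, a new interchangeable token could in principle be given a vector the same way at inference time, which is the intuition behind the extendable vocabulary in the title.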

Code & Models

Models

necrashter/interchangeable-token-embeddings (model)

Datasets

necrashter/interchangeable-token-embeddings-datasets (dataset · 46 downloads)

Videos

Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence (SlidesLive)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semigroups and Automata Theory

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Multi-Head Attention · Adam · Dropout