Ehrenfeucht-Haussler Rank and Chain of Thought

Pablo Barceló; Alexander Kozachinskiy; Tomasz Steifer

arXiv:2501.12997 · cs.LG · August 12, 2025

Open Access

TL;DR

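This paper characterizes the rank of a Boolean function as the minimum number of Chain of Thought (CoT) steps that a single-layer Transformer with hard attention needs to compute it. It uses this characterization to establish tight bounds on CoT steps for specific problems and discusses implications for PAC learning and multi-head Transformers.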

Contribution

It provides a novel Transformer-based characterization of rank, proves tight bounds on the number of CoT steps for specific functions, and analyzes the PAC-learnability of functions with bounded multi-head rank.

Findings

01

The rank of a Boolean function corresponds to the minimum number of CoT steps that a single-layer Transformer with hard attention needs to compute it.

02

$\ell$-fold function composition requires exactly $\ell$ CoT steps, and finding the position of the $k$-th occurrence of 1 in a Boolean sequence requires $k$ CoT steps.

03

Analysis of PAC-learnability for functions with bounded multi-head rank.
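To make the two problems named in the findings concrete, here is a minimal Python sketch of reference implementations. It only defines what is being computed and says nothing about how a Transformer computes it. The function names `kth_one_position` and `iterated_composition` are hypothetical, and so are the conventions used: 1-based positions, returning `None` when there are fewer than `k` ones, and modelling $\ell$-fold composition as iterating a single lookup table from a start index. The paper's exact formalization may differ.

```python
from typing import Optional, Sequence


def kth_one_position(bits: Sequence[int], k: int) -> Optional[int]:
    """Return the 1-based position of the k-th occurrence of 1 in bits.

    Returns None if bits contains fewer than k ones (an assumed convention).
    """
    seen = 0
    for pos, bit in enumerate(bits, start=1):
        if bit == 1:
            seen += 1
            if seen == k:
                return pos
    return None


def iterated_composition(phi: Sequence[int], ell: int, start: int = 0) -> int:
    """Apply the map phi to start exactly ell times.

    phi is given as a 0-indexed lookup table, so phi[i] is the image of i.
    This is one assumed reading of "ell-fold function composition".
    """
    x = start
    for _ in range(ell):
        x = phi[x]
    return x


if __name__ == "__main__":
    print(kth_one_position([0, 1, 1, 0, 1], 2))
    print(iterated_composition([1, 2, 0], 2))
```

Both functions run one sequential lookup per loop iteration, which gives some informal intuition for why the number of CoT steps could scale with $k$ or $\ell$. The actual bounds are the paper's results and are not derived from this sketch.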

Abstract

The notion of rank of a Boolean function has been a cornerstone in PAC learning theory, enabling quasipolynomial-time learning algorithms for polynomial-size decision trees. We present a novel characterization of rank, grounded in the well-known Transformer architecture. We show that the rank of a function $f$ corresponds to the minimum number of Chain of Thought (CoT) steps required by a single-layer Transformer with hard attention to compute $f$. Based on this characterization we establish tight bounds on the number of CoT steps required for specific problems, showing that $\ell$-fold function composition necessitates exactly $\ell$ CoT steps. Furthermore, we analyze the problem of identifying the position of the $k$-th occurrence of 1 in a Boolean sequence, proving that it requires $k$ CoT steps. Finally, we introduce the notion of the multi-head rank that…
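For readers unfamiliar with the rank mentioned in the abstract, the following Python sketch computes the rank of a given binary decision tree using the standard Ehrenfeucht-Haussler recursion: a leaf has rank 0, and an internal node whose subtrees have ranks $r_0$ and $r_1$ has rank $\max(r_0, r_1)$ if $r_0 \neq r_1$ and $r_0 + 1$ otherwise. The rank of a Boolean function is then the minimum rank over all decision trees computing it, so this code only gives an upper bound for the function a tree represents. The class and function names (`Leaf`, `Node`, `tree_rank`, `evaluate`) are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class Leaf:
    label: int  # output bit, 0 or 1


@dataclass(frozen=True)
class Node:
    var: int      # index of the input bit queried at this node
    low: "Tree"   # subtree followed when the queried bit is 0
    high: "Tree"  # subtree followed when the queried bit is 1


Tree = Union[Leaf, Node]


def tree_rank(tree: Tree) -> int:
    """Rank of a decision tree: 0 for a leaf; for a node, the larger child
    rank if the child ranks differ, otherwise the common child rank plus 1."""
    if isinstance(tree, Leaf):
        return 0
    r0, r1 = tree_rank(tree.low), tree_rank(tree.high)
    return max(r0, r1) if r0 != r1 else r0 + 1


def evaluate(tree: Tree, x: Sequence[int]) -> int:
    """Evaluate the tree on the input bit vector x."""
    while isinstance(tree, Node):
        tree = tree.high if x[tree.var] == 1 else tree.low
    return tree.label


# A path-shaped tree for x0 AND x1 AND x2: every node has a leaf child.
and3 = Node(0, Leaf(0), Node(1, Leaf(0), Node(2, Leaf(0), Leaf(1))))

# A complete depth-2 tree for x0 XOR x1.
xor2 = Node(0, Node(1, Leaf(0), Leaf(1)), Node(1, Leaf(1), Leaf(0)))

if __name__ == "__main__":
    print(tree_rank(and3), tree_rank(xor2))
```

The path-shaped tree for the 3-bit conjunction has rank 1 no matter how long the path grows, while the complete depth-2 tree for XOR has rank 2. This illustrates that rank measures how much balanced branching a tree contains rather than its depth or size.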

Taxonomy

Topics: Philosophy, Science, and History

Methods: Attention Is All You Need · Adam · Softmax · Absolute Position Encodings · Residual Connection · Dropout · Byte Pair Encoding · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer