LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers

Jingze Zhu; Yongliang Wu; Wenbo Zhu; Jiawang Cao; Yanqiang Zheng; Jiawei Chen; Xu Yang; Bernt Schiele; Jonas Fischer; Xinting Hu

arXiv:2507.04404·cs.AI·October 6, 2025


Open Access · 3 Reviews

TL;DR

LayerCake introduces a token-aware contrastive decoding method that aligns token types with specific transformer layers to enhance factual accuracy in large language models without additional training.

Contribution

It presents a novel layer-aware contrastive decoding approach that jointly considers token types and layer dynamics to improve factuality in LLM outputs.

Findings

1. Consistently improves factual accuracy across multiple LLMs.

2. Effectively suppresses attention to certain token types at specific layers.

3. No additional training or model modifications required.
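
The attention suppression named in the second finding can be pictured with a small sketch. This is a minimal illustration, not the authors' implementation. It assumes attention weights are available as post-softmax rows, that suppression means scaling selected key positions and renormalizing each row, and that the intervention applies only inside a chosen layer range. The names `suppress_attention`, `apply_in_layers`, `scale`, and `layer_range` are hypothetical.

```python
def suppress_attention(weights, suppressed, scale=0.0):
    """Scale attention on the `suppressed` key positions, then renormalize.

    weights: list of rows; each row holds post-softmax attention weights.
    suppressed: set of key positions (e.g. punctuation token indices).
    scale: multiplier applied to suppressed positions (assumption: 0.0 = full mask).
    """
    out = []
    for row in weights:
        scaled = [w * scale if j in suppressed else w for j, w in enumerate(row)]
        total = sum(scaled)
        if total == 0.0:
            # Nothing is left to attend to; fall back to the original row.
            out.append(list(row))
        else:
            out.append([w / total for w in scaled])
    return out


def apply_in_layers(layer, weights, suppressed, layer_range, scale=0.0):
    """Apply the suppression only when `layer` falls inside `layer_range`."""
    if layer in layer_range:
        return suppress_attention(weights, suppressed, scale)
    return weights


# Hypothetical usage: mask key position 0 in layers 0-2 only.
rows = [[0.5, 0.3, 0.2]]
early = apply_in_layers(1, rows, {0}, range(0, 3))
late = apply_in_layers(5, rows, {0}, range(0, 3))
```

In this sketch the remaining weights are rescaled so each row still sums to 1. Whether the paper renormalizes, masks before the softmax, or uses a partial scale is not stated on this page.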

Abstract

Large language models (LLMs) excel at natural language understanding and generation but remain vulnerable to factual errors, limiting their reliability in knowledge-intensive tasks. While decoding-time strategies provide a promising efficient solution without training, existing methods typically treat token-level and layer-level signals in isolation, overlooking the joint dynamics between them. In this work, we introduce a token-aware, layer-localized contrastive decoding method that aligns specific token types with their most influential transformer layers to improve factual generation. Through empirical attention analysis, we identify two key patterns: punctuation tokens receive dominant attention in early layers, while conceptual tokens govern semantic reasoning in intermediate layers. By selectively suppressing attention to these token types at their respective depths, we achieve…
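
To make the contrastive step concrete, here is a minimal sketch of generic contrastive decoding between an unmodified forward pass and a perturbed one (for instance, a pass with attention to some token type suppressed). It is not the paper's formula, which the truncated abstract does not give. The function names, the `strength` weight, and the use of a plausibility cutoff are assumptions. The default `plausibility=0.1` mirrors the β = 0.1 value quoted in the reviews, though this page does not say what β controls in the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_scores(base_logits, perturbed_logits, strength=1.0, plausibility=0.1):
    """Score tokens by how much the base pass prefers them over the perturbed pass.

    Tokens whose base probability is below `plausibility` times the top base
    probability are excluded (assumed plausibility cutoff).
    """
    base = log_softmax(base_logits)
    pert = log_softmax(perturbed_logits)
    cutoff = max(base) + math.log(plausibility)
    scores = []
    for b, p in zip(base, pert):
        if b < cutoff:
            scores.append(float("-inf"))  # implausible under the base model
        else:
            scores.append(b + strength * (b - p))
    return scores


def greedy_pick(scores):
    """Return the index of the highest-scoring token."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical logits over a three-token vocabulary.
base_logits = [2.0, 1.9, -5.0]
perturbed_logits = [2.5, 0.5, 3.0]
choice = greedy_pick(contrastive_scores(base_logits, perturbed_logits))
```

With these made-up logits, the base pass alone slightly prefers token 0, while the contrast favors token 1, whose probability drops most under the perturbation. Token 2 is filtered out by the cutoff even though the perturbed pass ranks it highly.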

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 4

Strengths

- Experimental results on models such as LLaMA, Mistral, and Qwen seem to show improved factual accuracy on several benchmarks.

Weaknesses

- The paper introduces many hyperparameters in the method (α, $th_a$, $th_b$, β, and the layer ranges), but it is unclear how the authors chose them. The only clear statement is: "We set β = 0.1 throughout the paper." Otherwise, the authors mention that "thresholds $th_a$, $th_b$, and α are determined empirically", but give no explanation or details. What does "determined empirically" mean? It is possible that the authors are adjusting the hyperparameters based on each individual test set perf…

Reviewer 02 · Rating 6 · Confidence 4

Strengths

Constructing a contrastive signal by purposefully inducing erroneous predictions through targeted interventions is an elegant idea. The intervention design, which selectively suppresses attention to specific token types at their most influential layers, creates a meaningful contrastive distribution that exposes how factual reasoning emerges within the model. The link between token category (e.g. structural vs. conceptual) and layer range (early vs. mid vs. late) seems well-motivated, and the res

Weaknesses

Most evaluations focus on short-form, question-answering style tasks, which raises questions about the generality of the approach. It remains unclear whether the approach would perform equally well for more open-ended forms of text generation such as long-form summarization, creative writing, or code synthesis. These tasks involve richer discourse structures, longer contexts, and a more complex interplay of coherence and factuality than typical QA settings. The specific intervention strategy: em

Reviewer 03 · Rating 4 · Confidence 3

Strengths

* The general approach makes sense, that it is possible to identify more precise sources of "problematic" LLM behaviors, and use those to perform contrastive decoding that is more informed and targeted towards these areas (rather than a generic "weak model" vs. "expert model" scenario).
* The specific method of using attention interventions on specific token categories is novel as far as I'm aware.
* The results are presented over a wide array of benchmarks for different types of tasks, and show

Weaknesses

1. The method relies on quite a lot of parameters and heuristics - I was not entirely convinced by the motivation for those, and at the same time the paper is somewhat unclear regarding the methodology for choosing them in practice. Specifically, there is $th_a$ and $th_b$, and $\alpha$; the choice of layers representing the "early" and "middle" stages of processing; a separate attention modification logic for punctuation and conceptual tokens; and the decision to perform contrast with each of t


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling