LAYA: Layer-wise Attention Aggregation for Interpretable Depth-Aware Neural Networks
Gennaro Vessio

TL;DR
LAYA introduces a novel attention-based output head for neural networks that dynamically combines internal layer representations, enhancing interpretability and maintaining or improving predictive performance across vision and language tasks.
Contribution
The paper proposes LAYA, a layer-wise attention aggregation mechanism that improves interpretability and performance by leveraging internal representations in neural networks.
Findings
LAYA improves accuracy by up to 1% on the reported benchmarks.
It provides explicit layer attribution scores for interpretability.
It applies to both vision and language models.
Abstract
Deep neural networks typically rely on the representation produced by their final hidden layer to make predictions, implicitly assuming that this single vector fully captures the semantics encoded across all preceding transformations. However, intermediate layers contain rich and complementary information -- ranging from low-level patterns to high-level abstractions -- that is often discarded when the decision head depends solely on the last representation. This paper revisits the role of the output layer and introduces LAYA (Layer-wise Attention Aggregator), a novel output head that dynamically aggregates internal representations through attention. Instead of projecting only the deepest embedding, LAYA learns input-conditioned attention weights over layer-wise features, yielding an interpretable and architecture-agnostic mechanism for synthesizing predictions. Experiments on vision and…
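The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not specify the scoring function, so the dot-product scorer, the `score_vec` parameter, and the assumption that all layer features share one dimension are hypothetical choices made for illustration only. The sketch shows how input-dependent softmax weights over layer-wise features yield both a fused representation and per-layer attribution scores.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(layer_feats, score_vec):
    """Fuse per-layer feature vectors with attention weights.

    layer_feats: list of L vectors, each of length d (assumed to be
                 already projected to a common dimension).
    score_vec:   hypothetical learned scoring vector of length d.
    Returns (fused_vector, attention_weights).
    """
    # One scalar score per layer; it depends on the input's own features,
    # so the resulting weights are input-conditioned.
    scores = [sum(w * x for w, x in zip(score_vec, h)) for h in layer_feats]
    alphas = softmax(scores)
    d = len(layer_feats[0])
    # Weighted sum of layer features, computed coordinate by coordinate.
    fused = [sum(a * h[j] for a, h in zip(alphas, layer_feats)) for j in range(d)]
    return fused, alphas


# Toy example: three "layers", each with a 2-dimensional feature vector.
layer_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score_vec = [0.5, 0.5]
fused, alphas = aggregate_layers(layer_feats, score_vec)
```

Here `alphas` plays the role of the layer attribution scores: the weights are non-negative, sum to one, and can be read directly as how much each layer contributed to the fused vector that a prediction head would consume.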
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
