Symbolic Rule Extraction from Attention-Guided Sparse Representations in Vision Transformers

Parth Padalkar; Gopal Gupta

arXiv:2505.06745 · cs.CV · January 14, 2026

TL;DR

This paper extracts symbolic, logic-based rules from Vision Transformers (ViTs) by adding a sparse concept layer whose binarized activations are passed to a rule-generation algorithm (FOLD-SE-M), improving both interpretability and accuracy.

Contribution

The paper presents the first framework for symbolic rule extraction from ViTs, combining a sparse concept layer with logic programming to improve interpretability and model performance.

Findings

01. Achieved 5.14% higher accuracy than the standard ViT

02. Generated concise, meaningful logic rules from ViT representations

03. Enabled direct symbolic reasoning within the ViT architecture

Abstract

Recent neuro-symbolic approaches have successfully extracted symbolic rule-sets from CNN-based models to enhance interpretability. However, applying similar techniques to Vision Transformers (ViTs) remains challenging due to their lack of modular concept detectors and reliance on global self-attention mechanisms. We propose a framework for symbolic rule extraction from ViTs by introducing a sparse concept layer inspired by Sparse Autoencoders (SAEs). This linear layer operates on attention-weighted patch representations and learns a disentangled, binarized representation in which individual neurons activate for high-level visual concepts. To encourage interpretability, we apply a combination of L1 sparsity, entropy minimization, and supervised contrastive loss. These binarized concept activations are used as input to the FOLD-SE-M algorithm, which generates a rule-set in the form of…
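The concept layer described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the pooling scheme, the sigmoid activation, the 0.5 threshold, the penalty weights, and all function names (`attention_weighted_pool`, `concept_layer`, `l1_penalty`, `entropy_penalty`) are assumptions made for illustration. The supervised contrastive loss is omitted, and the rule-learning step (FOLD-SE-M) is not shown.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_weighted_pool(patches, attn):
    """Pool patch embeddings using attention scores as weights (assumed scheme).

    patches: (num_patches, dim) patch embeddings
    attn:    (num_patches,) non-negative attention scores
    """
    weights = attn / attn.sum()          # normalize scores to sum to 1
    return weights @ patches             # (dim,)

def concept_layer(pooled, W, b, threshold=0.5):
    """Linear layer followed by a sigmoid, then hard binarization."""
    soft = sigmoid(W @ pooled + b)       # soft concept activations in (0, 1)
    binary = (soft > threshold).astype(np.int64)
    return soft, binary

def l1_penalty(soft):
    """Mean absolute activation; pushes most concepts toward 0."""
    return float(np.abs(soft).mean())

def entropy_penalty(soft, eps=1e-8):
    """Mean binary entropy; pushes each activation toward 0 or 1."""
    p = np.clip(soft, eps, 1.0 - eps)
    return float((-(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))).mean())

# Toy input: 16 patches of dimension 8, 4 concept neurons (all hypothetical sizes).
rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))
attn = rng.random(16)
W = rng.normal(size=(4, 8))
b = np.zeros(4)

pooled = attention_weighted_pool(patches, attn)
soft, binary = concept_layer(pooled, W, b)

# Interpretability regularizers only; the 0.1 weights are arbitrary placeholders.
reg_loss = 0.1 * l1_penalty(soft) + 0.1 * entropy_penalty(soft)
```

In this sketch, `binary` plays the role of the binarized concept vector that a rule learner would take as input, one row per image. The entropy term is largest when an activation sits at 0.5, so minimizing it encourages activations that binarize cleanly.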

Taxonomy

Methods: Linear Layer