Purrturbed but Stable: Human-Cat Invariant Representations Across CNNs, ViTs and Self-Supervised ViTs
Arya Shah, Vaibhav Tripathi

TL;DR
This study evaluates how well different neural network architectures and training regimes align feline and human visual representations, and finds that a self-supervised Vision Transformer (DINO ViT-B/16) attains the strongest cross-species alignment.
Contribution
The paper introduces a unified, frozen-encoder benchmark for feline-human representational alignment and compares convolutional networks, supervised ViTs, windowed transformers, and self-supervised ViTs on it, with the self-supervised ViT showing the strongest alignment.
Findings
Self-supervised ViTs show the highest alignment with feline and human visual representations.
Alignment peaks at early blocks, suggesting that early-stage features carry most of the shared structure.
Architectural biases influence cross-species representational similarity.
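The layer-wise comparison behind these findings can be illustrated with a small Representational Similarity Analysis sketch. It is not the authors' pipeline: the function names (`rdm`, `rsa_score`, `layerwise_profile`), the correlation-distance dissimilarity, and the Spearman comparison are assumptions chosen for illustration, and the inputs are simply two lists of per-layer activation matrices (stimuli × features) for the two conditions being compared.

```python
import numpy as np

def rdm(feats):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the feature vectors of every pair of stimuli (rows)."""
    return 1.0 - np.corrcoef(feats)

def _ranks(v):
    """Ordinal ranks of a 1-D array (ties are not averaged; this is an
    assumption that is harmless for continuous-valued dissimilarities)."""
    r = np.empty(len(v))
    r[np.argsort(v)] = np.arange(len(v))
    return r

def rsa_score(feats_a, feats_b):
    """Spearman correlation between the upper triangles of two RDMs."""
    iu = np.triu_indices(feats_a.shape[0], k=1)
    a = rdm(feats_a)[iu]
    b = rdm(feats_b)[iu]
    return float(np.corrcoef(_ranks(a), _ranks(b))[0, 1])

def layerwise_profile(layers_a, layers_b):
    """RSA score per layer, plus the index of the best-aligned layer."""
    scores = [rsa_score(a, b) for a, b in zip(layers_a, layers_b)]
    return scores, int(np.argmax(scores))

# Synthetic stand-in for activations: 40 stimuli, 32 features, 3 layers.
rng = np.random.default_rng(0)
layers_a = [rng.normal(size=(40, 32)) for _ in range(3)]
layers_b = [
    layers_a[0] + 0.05 * rng.normal(size=(40, 32)),  # nearly shared geometry
    rng.normal(size=(40, 32)),                       # unrelated
    rng.normal(size=(40, 32)),                       # unrelated
]
scores, peak = layerwise_profile(layers_a, layers_b)
```

In this toy setup the peak falls on the first layer by construction; with real encoders the profile would come from activations extracted at each block.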
Abstract
Cats and humans differ in ocular anatomy. Most notably, Felis catus (domestic cats) have vertically elongated pupils linked to ambush predation; yet, how such specializations manifest in downstream visual representations remains incompletely understood. We present a unified, frozen-encoder benchmark that quantifies feline-human cross-species representational alignment in the wild, across convolutional networks, supervised Vision Transformers, windowed transformers, and self-supervised ViTs (DINO), using layer-wise Centered Kernel Alignment (linear and RBF) and Representational Similarity Analysis, with additional distributional and stability tests reported in the paper. Across models, DINO ViT-B/16 attains the most substantial alignment (mean CKA-RBF, mean CKA-linear, mean RSA), peaking at early blocks, indicating that token-level…
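For readers unfamiliar with Centered Kernel Alignment, the following is a minimal NumPy sketch of the linear and RBF variants named in the abstract. It uses the standard HSIC-based definition, not code from the paper; the function names and the median-distance bandwidth heuristic (`sigma_frac`) are assumptions made for illustration.

```python
import numpy as np

def _center(K):
    """Double-center a Gram matrix: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def _hsic(K, L):
    """Biased HSIC estimate, up to a constant that cancels inside CKA."""
    return np.sum(_center(K) * _center(L))

def _cka(K, L):
    """CKA between two Gram matrices computed on the same stimuli."""
    return float(_hsic(K, L) / np.sqrt(_hsic(K, K) * _hsic(L, L)))

def rbf_gram(X, sigma_frac=1.0):
    """RBF Gram matrix; the bandwidth is tied to the median squared
    pairwise distance (an assumed, commonly used heuristic)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    med = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-d2 / (2.0 * sigma_frac ** 2 * med))

def linear_cka(X, Y):
    """Linear CKA between activations X and Y (rows = same stimuli)."""
    return _cka(X @ X.T, Y @ Y.T)

def rbf_cka(X, Y, sigma_frac=1.0):
    """RBF CKA between activations X and Y (rows = same stimuli)."""
    return _cka(rbf_gram(X, sigma_frac), rbf_gram(Y, sigma_frac))

# Synthetic check: a rotated, rescaled copy is perfectly aligned under
# linear CKA, while independent random features are not.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal matrix
Y = rng.normal(size=(50, 16))
same = linear_cka(X, 3.0 * X @ Q)
diff = linear_cka(X, Y)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, which is why it is suited to comparing layers of different networks; the RBF variant additionally depends on the bandwidth choice.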
Taxonomy
Topics: Face Recognition and Perception · Visual perception and processing mechanisms · Ophthalmology and Visual Impairment Studies
