Appearance-free Action Recognition: Zero-shot Generalization in Humans and a Two-Pathway Model

Prerana Kumar (1, 2), Martin A. Giese (1) ((1) Hertie Institute, University of Tuebingen, (2) IMPRS-IS)

arXiv:2604.16675 · cs.CV · April 21, 2026

TL;DR

Both humans and a two-pathway neural network recognize actions in appearance-free videos without being trained on them, which points to motion cues as a key driver of zero-shot generalization in action recognition.

Contribution

The study shows that humans and a novel two-pathway model generalize zero-shot to appearance-free videos, highlighting the role of motion in action recognition.

Findings

01 Humans recognize actions in appearance-free videos above chance levels.

02 The two-pathway model outperforms existing models and aligns closely with human performance.

03 Motion cues are critical for generalization to appearance-free videos.
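
To make "above chance" concrete: the abstract describes five action categories, so guessing would yield 20% accuracy. The following is a minimal sketch of how such a claim could be checked with a one-sided exact binomial test. It is not the authors' analysis; the function name `binomial_p_value` and the trial counts are hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_p_value(correct: int, trials: int, chance: float) -> float:
    """One-sided exact binomial tail: P(X >= correct) if every response is a guess."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


N_CATEGORIES = 5           # five action categories, as stated in the abstract
CHANCE = 1 / N_CATEGORIES  # chance level for five-way classification

# Hypothetical counts for illustration only; these are NOT results from the paper.
trials = 50
correct = 20

p = binomial_p_value(correct, trials, CHANCE)
print(f"accuracy={correct / trials:.2f}, chance={CHANCE:.2f}, p={p:.4g}")
```

A small p-value here would indicate that the observed accuracy is unlikely under pure guessing. A real analysis would also need to account for repeated measures across participants, which this sketch ignores.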

Abstract

Action recognition is a fundamental ability for social species. Yet, its underlying computations are not well understood. Classical psychophysical studies using simplified stimuli have shown that humans can perceive body motion even under degradation of relevant shape cues. Recent work using real-world action videos and their appearance-free counterparts (that preserve motion but lack static shape cues) included explicit training of humans and models on the appearance-free videos. Whether humans and vision models generalize in a zero-shot manner to appearance-free transformations of real-world action videos is not yet known. To measure this generalization in humans, we conducted a laboratory-based psychophysics experiment. 22 participants were trained to recognize five action categories using naturalistic videos (UCF5 dataset), and tested zero-shot on two types of appearance-free…
