TL;DR
This paper presents a neurosymbolic framework that combines deep learning and logical reasoning for interpretable skeleton-based human action recognition, achieving competitive accuracy while producing explicit logical explanations.
Contribution
It introduces a novel neurosymbolic approach that grounds motion concepts in logical predicates and aligns them with language descriptions for interpretable action recognition.
Findings
Achieves competitive recognition accuracy on NTU RGB+D datasets.
Provides explicit logical explanations for recognized actions.
Aligns learned concepts with language descriptions for semantic grounding.
Abstract
Skeleton-based human activity recognition (HAR) has achieved strong empirical performance, yet most existing models remain black boxes that are difficult to interpret. In this work, we introduce a neurosymbolic formulation of skeleton-based HAR that reframes action recognition as concept-driven first-order logical reasoning over motion primitives. Our framework bridges representation learning and symbolic inference by grounding first-order logic predicates in learnable spatial and temporal motion concepts. Specifically, we employ a standard spatio-temporal skeleton encoder to extract latent motion representations, which are then mapped to interpretable concept predicates via a spatio-temporal concept decoder that explicitly separates pose-centric and dynamics-centric abstractions. These concept predicates are composed through differentiable first-order logic layers, enabling the model to learn…
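To make the idea of composing concept predicates through differentiable logic more concrete, here is a minimal sketch. It is not the paper's implementation: the excerpt does not specify how the logic layers are built, so the sketch assumes product t-norm fuzzy logic, and the predicate names (`arm_raised`, `hand_moving`), the rule, and the truth values are all hypothetical.

```python
from math import prod

# Assumption: each concept predicate is a soft truth value in [0, 1] per frame,
# as a concept decoder might produce. The values below are made up.
concepts = {
    "arm_raised":  [0.1, 0.8, 0.9],   # hypothetical pose-centric predicate
    "hand_moving": [0.2, 0.7, 0.9],   # hypothetical dynamics-centric predicate
}

def soft_and(values):
    """Product t-norm: conjunction as a product of truth values."""
    return prod(values)

def soft_or(values):
    """Probabilistic sum: disjunction as 1 - prod(1 - v)."""
    return 1.0 - prod(1.0 - v for v in values)

def soft_not(value):
    """Negation as the complement of the truth value."""
    return 1.0 - value

def exists_over_time(per_frame):
    """Existential quantifier over frames, relaxed to a soft OR."""
    return soft_or(per_frame)

def forall_over_time(per_frame):
    """Universal quantifier over frames, relaxed to a soft AND."""
    return soft_and(per_frame)

def rule_waving(c):
    """Hypothetical rule: waving := exists t. arm_raised(t) AND hand_moving(t)."""
    per_frame = [soft_and([a, h]) for a, h in zip(c["arm_raised"], c["hand_moving"])]
    return exists_over_time(per_frame)

score = rule_waving(concepts)
print(f"waving score: {score:.6f}")
```

Because every operator is built from products and differences, the rule score is a smooth function of the predicate values, which is what would let gradients flow back to an encoder in a trainable version. The per-frame conjunctions also show where an explanation could come from: the frame with the highest conjunction value is the one that most supports the rule.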
