TL;DR
This paper introduces an unsupervised algorithm that infers whether an agent's behavior is intentional from its 3D kinematics by incorporating basic physical knowledge. Without any training data, it achieves performance comparable to supervised methods.
Contribution
The paper presents a novel unsupervised approach that uses physical principles to recognize intentional actions in visual scenes, a capability previously limited to supervised learning.
Findings
Algorithm accurately distinguishes intentional from unintentional actions.
Performance comparable to supervised baselines on diverse datasets.
Works without training data, demonstrating generalization.
Abstract
The performance of computer vision algorithms is near or superior to that of humans in visual problems including object recognition (especially of fine-grained categories), segmentation, and 3D object reconstruction from 2D views. Humans are, however, capable of higher-level image analyses. A clear example, involving theory of mind, is our ability to determine whether a perceived behavior or action was performed intentionally or not. In this paper, we derive an algorithm that can infer whether the behavior of an agent in a scene is intentional or unintentional based on its 3D kinematics, using knowledge of self-propelled motion, Newtonian motion, and their relationship. We show how the addition of this basic knowledge leads to a simple, unsupervised algorithm. To test the derived algorithm, we constructed three dedicated datasets, ranging from abstract geometric animation to realistic videos of…
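The abstract's distinction between Newtonian and self-propelled motion can be illustrated with a small sketch. This is not the paper's algorithm, whose details are not given here. It only shows one way to test whether a 3D trajectory is consistent with free motion under gravity. The function names, the z-up coordinate convention, the SI units, and the tolerance `tol` are all assumptions made for illustration. Contact forces are ignored, so an object resting on the ground would be flagged as well.

```python
import math

# Assumption: z points up, positions are in metres, time in seconds.
GRAVITY = (0.0, 0.0, -9.81)


def accelerations(positions, dt):
    """Estimate acceleration with central second differences of (x, y, z) samples."""
    acc = []
    for prev, cur, nxt in zip(positions, positions[1:], positions[2:]):
        acc.append(tuple((p - 2.0 * c + n) / (dt * dt)
                         for p, c, n in zip(prev, cur, nxt)))
    return acc


def self_propelled_frames(positions, dt, tol=0.5):
    """Flag frames whose acceleration departs from free fall by more than tol (m/s^2).

    A True flag means gravity alone does not explain the motion at that frame.
    It is a hypothetical cue for self-propulsion, not a verdict on intent.
    """
    flags = []
    for a in accelerations(positions, dt):
        residual = math.sqrt(sum((ai - gi) ** 2 for ai, gi in zip(a, GRAVITY)))
        flags.append(residual > tol)
    return flags


def trajectory(n, dt, extra_acc=(0.0, 0.0, 0.0)):
    """Synthetic test data: a thrown point mass, optionally with added constant thrust."""
    p0, v0 = (0.0, 0.0, 1.0), (1.0, 0.0, 2.0)
    total = tuple(g + e for g, e in zip(GRAVITY, extra_acc))
    return [tuple(p + v * (i * dt) + 0.5 * a * (i * dt) ** 2
                  for p, v, a in zip(p0, v0, total))
            for i in range(n)]


dt = 0.01
free_flags = self_propelled_frames(trajectory(50, dt), dt)
pushed_flags = self_propelled_frames(trajectory(50, dt, (3.0, 0.0, 0.0)), dt)
```

In this synthetic case, the purely ballistic trajectory produces no flagged frames, while the trajectory with an added 3 m/s^2 horizontal push is flagged throughout. Real kinematic data would need smoothing before differentiation, since second differences amplify measurement noise.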
