Grounding the Lexical Semantics of Verbs in Visual Perception using Force Dynamics and Event Logic
J. M. Siskind

TL;DR
This paper introduces a system that uses force dynamics and event logic to recognize spatial-motion events in short image sequences and to infer compound events from primitive ones, making it more robust than previous systems based on motion profile.
Contribution
It presents a novel integration of force dynamics and event logic for lexical semantics, with an efficient representation and inference procedure for complex events.
Findings
Robust recognition of spatial-motion events in image sequences.
Efficient inference of compound events from primitive events.
Improved robustness over prior systems based on motion profile.
Abstract
This paper presents an implemented system for recognizing the occurrence of events described by simple spatial-motion verbs in short image sequences. The semantics of these verbs is specified with event-logic expressions that describe changes in the state of force-dynamic relations between the participants of the event. An efficient finite representation is introduced for the infinite sets of intervals that occur when describing liquid and semi-liquid events. Additionally, an efficient procedure using this representation is presented for inferring occurrences of compound events, described with event-logic expressions, from occurrences of primitive events. Using force dynamics and event logic to specify the lexical semantics of events allows the system to be more robust than prior systems based on motion profile.
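To make the idea of a finite representation for infinite interval sets concrete, here is a minimal sketch, not the paper's actual data structure. It assumes that a liquid event is one that, if true over an interval, is also true over every subinterval, and it assumes closed intervals over real-valued time. The names `SpanningInterval`, `liquid`, and `intersect` are hypothetical and chosen only for illustration; the paper's own representation may differ in its details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanningInterval:
    """Finite stand-in for the (possibly infinite) set of closed intervals
    [s, e] with s in [s_lo, s_hi], e in [e_lo, e_hi], and s <= e.
    Simplifying assumption: closed endpoints only."""
    s_lo: float
    s_hi: float
    e_lo: float
    e_hi: float

    def contains(self, s: float, e: float) -> bool:
        # Membership test for one concrete interval [s, e].
        return (s <= e
                and self.s_lo <= s <= self.s_hi
                and self.e_lo <= e <= self.e_hi)

    def is_empty(self) -> bool:
        # Empty if either range is empty or no start can precede an end.
        return (self.s_lo > self.s_hi
                or self.e_lo > self.e_hi
                or self.s_lo > self.e_hi)


def liquid(a: float, b: float) -> SpanningInterval:
    """All subintervals of [a, b]: what a liquid event that holds
    over [a, b] is assumed to imply."""
    return SpanningInterval(a, b, a, b)


def intersect(x: SpanningInterval,
              y: SpanningInterval) -> Optional[SpanningInterval]:
    """Intervals over which both events hold (a simple conjunction)."""
    r = SpanningInterval(max(x.s_lo, y.s_lo), min(x.s_hi, y.s_hi),
                         max(x.e_lo, y.e_lo), min(x.e_hi, y.e_hi))
    return None if r.is_empty() else r


# Hypothetical primitive events: "supported" holds over [0, 10],
# "contacts" holds over [4, 15]; both are treated as liquid.
both = intersect(liquid(0, 10), liquid(4, 15))
```

In this sketch, `both` equals `liquid(4, 10)`: four numbers summarize every interval over which the two hypothetical primitive events hold together, so the compound event can be computed without enumerating intervals.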
