Reasoning about Actions over Visual and Linguistic Modalities: A Survey

Shailaja Keyur Sampat, Maitreya Patel, Subhasish Das, Yezhou Yang, and Chitta Baral

arXiv:2207.07568·cs.CL·July 18, 2022·6 cites

Open Access

TL;DR

This survey reviews recent progress in reasoning about actions across visual and linguistic modalities, covering tasks, datasets, models, open challenges, and future directions in this interdisciplinary field.

Contribution

It provides a comprehensive overview of Reasoning about Actions & Change (RAC) in vision and language, summarizing existing work and benchmarks and identifying key challenges and future research directions.

Findings

01. Summarizes current tasks and datasets in RAC

02. Analyzes performance of various models

03. Discusses challenges and future directions

Abstract

'Actions' play a vital role in how humans interact with the world and enable them to achieve desired goals. As a result, most common sense (CS) knowledge for humans revolves around actions. While 'Reasoning about Actions & Change' (RAC) has been widely studied in the Knowledge Representation community, it has recently piqued the interest of NLP and computer vision researchers. This paper surveys existing tasks, benchmark datasets, various techniques and models, and their respective performance concerning advancements in RAC in the vision and language domain. Towards the end, we summarize our key takeaways, discuss the present challenges facing this research area, and outline potential directions for future research.

Taxonomy

Topics: Semantic Web and Ontologies · Topic Modeling · Multimodal Machine Learning Applications