On the Importance of Karaka Framework in Multi-modal Grounding

Sai Kiran Gorthi; Radhika Mamidi

arXiv:2204.04347·cs.CL·April 12, 2022·1 cites

TL;DR

This paper explores the potential benefits of the Karaka Framework, based on the Computational Paninian Grammar model, for improving multi-modal grounding in vision-language navigation tasks, an area with limited prior study.

Contribution

It proposes a novel investigation into applying the Computational Paninian Grammar (CPG) dependency scheme to multi-modal vision-language tasks, highlighting its potential advantages and challenges.

Findings

01. Potential for more semantically aligned dependency relations

02. Insights into the applicability of CPG in multi-modal tasks

03. Foundation for future empirical evaluation

Abstract

The Computational Paninian Grammar (CPG) model helps decode a natural language expression as a series of modifier-modified relations, and therefore facilitates identifying dependency relations that are closer to language (context) semantics than the usual Stanford dependency relations. However, the importance of the CPG dependency scheme has not been studied in the context of multi-modal vision and language applications. At IIIT Hyderabad, we plan to perform a novel study exploring the potential advantages and disadvantages of the CPG framework in a vision-language navigation task setting, a popular and challenging multi-modal grounding task.
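To make the idea of modifier-modified relations concrete, here is a minimal sketch of how a navigation instruction might be represented with karaka labels. It is not taken from the paper: the example sentence, the `Relation` type, the `grounding_slots` helper, and the hand-assigned labels are all illustrative assumptions. The labels follow the tags commonly used in Paninian dependency annotation (`k5` for source, `k2p` for destination).

```python
from dataclasses import dataclass

# A small subset of karaka labels, glossed informally (assumed for illustration).
KARAKA_GLOSS = {
    "k5": "apadana (source, point of departure)",
    "k2p": "goal or destination",
}


@dataclass(frozen=True)
class Relation:
    """One modifier-modified pair with its karaka label (hypothetical type)."""
    modifier: str
    modified: str
    label: str


# Hand-annotated example, not parser output and not from the paper.
INSTRUCTION = "Walk from the kitchen to the bedroom"
RELATIONS = [
    Relation(modifier="kitchen", modified="Walk", label="k5"),
    Relation(modifier="bedroom", modified="Walk", label="k2p"),
]


def grounding_slots(relations):
    """Map each karaka label to the word that fills it."""
    return {r.label: r.modifier for r in relations}


slots = grounding_slots(RELATIONS)
source, goal = slots["k5"], slots["k2p"]
```

In this sketch the two prepositional phrases receive different labels that name their semantic role directly, so a navigation agent could read the start and target locations straight off the parse. A Stanford-style scheme would typically attach both phrases with the same nominal-modifier relation and leave the distinction to the preposition.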

Taxonomy

Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Natural Language Processing Techniques