MASALA: Modelling and Analysing the Semantics of Adpositions in Linguistic Annotation of Hindi

Aryaman Arora; Nitin Venkateswaran; Nathan Schneider

arXiv:2205.03955 · cs.CL · May 10, 2022 · 1 citation

Open Access

TL;DR

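This paper introduces MASALA, a publicly available Hindi corpus in which adpositions and case markers are annotated with semantic relations under the multilingual SNACS scheme. It also uses language models to label SNACS supersenses automatically, achieving results competitive with past work on English, and points towards applications in semantic role labelling and extension to related languages such as Gujarati.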

Contribution

It provides a publicly available Hindi corpus of adpositions and case markers annotated with SNACS, and shows that language models can label SNACS supersenses automatically with results competitive with past work on English.

Findings

01 Achieved SNACS supersense labelling results in Hindi that are competitive with past work on English.

02 Created a publicly available annotated corpus of Hindi adpositions and case markers.

03 Identified semantic role labelling and extension to related languages such as Gujarati as directions for future work.

Abstract

We present a completed, publicly available corpus of annotated semantic relations of adpositions and case markers in Hindi. We used the multilingual SNACS annotation scheme, which has been applied to a variety of typologically diverse languages. Building on past work examining linguistic problems in SNACS annotation, we use language models to attempt automatic labelling of SNACS supersenses in Hindi and achieve results competitive with past work on English. We look towards upstream applications in semantic role labelling and extension to related languages such as Gujarati.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems