Language-independence of DisCoCirc's Text Circuits: English and Urdu
Muhammad Hamza Waseem, Jonathon Liu, Vincent Wang-Maścianica, Bob Coecke

TL;DR
This paper sketches how DisCoCirc, a circuit-based framework for representing the grammar and semantics of text, can be developed for restricted fragments of both English and Urdu, and shows that the grammatical differences between the two languages, mainly word and phrase order, vanish in the circuit representation.
Contribution
It extends DisCoCirc to Urdu, showing that grammatical structural differences between English and Urdu diminish in circuit form, supporting language-independence.
Findings
DisCoCirc can be developed for Urdu similarly to English.
Grammatical differences like word order are minimized in DisCoCirc circuits.
English and Urdu grammatical structures become similar in circuit representations.
Abstract
DisCoCirc is a newly proposed framework for representing the grammar and semantics of texts using compositional, generative circuits. While it constitutes a development of the Categorical Distributional Compositional (DisCoCat) framework, it exposes radically new features. In particular, [14] suggested that DisCoCirc goes some way toward eliminating grammatical differences between languages. In this paper we sketch an argument that this is indeed the case for restricted fragments of English and Urdu. We first develop DisCoCirc for a fragment of Urdu, as was done for English in [14]. There is a simple translation from English grammar to Urdu grammar, and vice versa. We then show that differences in grammatical structure between English and Urdu, primarily relating to the ordering of words and phrases, vanish when passing to DisCoCirc circuits.
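The claim that word-order differences vanish in circuit form can be illustrated with a small toy sketch. This is not the paper's construction: real DisCoCirc circuits are derived from a typed grammar and drawn as string diagrams, whereas here a "circuit" is just a list of verb gates acting on noun wires. The lexicons, the `Gate` type, the `to_circuit` helper, and the romanized Urdu example sentence are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Gate(NamedTuple):
    label: str              # language-neutral name of the verb box
    wires: Tuple[str, ...]  # noun wires it acts on: (subject, object)


# Hypothetical toy lexicons: surface word -> (role, language-neutral label).
ENGLISH = {
    "Alice": ("noun", "ALICE"),
    "Bob": ("noun", "BOB"),
    "sees": ("verb", "SEE"),
}
URDU = {
    "Alice": ("noun", "ALICE"),
    "Bob": ("noun", "BOB"),
    "ko": ("marker", ""),            # object marker; gets no box in this toy model
    "dekhti hai": ("verb", "SEE"),   # treated as a single verb token (assumption)
}


def to_circuit(tokens: List[str], lexicon: dict, order: str) -> List[Gate]:
    """Read one transitive sentence whose content words follow `order`
    ('SVO' for the English fragment, 'SOV' for the Urdu fragment)."""
    # Drop markers and keep (role, label) pairs for the content words.
    content = [lexicon[t] for t in tokens if lexicon[t][0] != "marker"]
    slots = dict(zip(order, content))
    assert slots["V"][0] == "verb"
    assert slots["S"][0] == "noun" and slots["O"][0] == "noun"
    # The gate records which wires the verb acts on, not where the verb stood.
    return [Gate(slots["V"][1], (slots["S"][1], slots["O"][1]))]


english = to_circuit(["Alice", "sees", "Bob"], ENGLISH, "SVO")
urdu = to_circuit(["Alice", "Bob", "ko", "dekhti hai"], URDU, "SOV")
print(english == urdu)  # True
```

The verb sits in a different position in the two sentences, but the resulting gate depends only on which noun wires it connects, so both sentences yield the same toy circuit.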
