Word-Embeddings Distinguish Denominal and Root-Derived Verbs in Semitic
Ido Benbaji (MIT), Omri Doron (MIT), Adèle Hénot-Mortier (MIT)

TL;DR
This paper tests whether Hebrew word embeddings encode the morphological distinction between denominal and root-derived verbs. In four embedding models, a denominal verb lies closer to its source noun than a root-derived verb related to that noun does, which supports the two-level hypothesis of word formation.
Contribution
It provides empirical evidence that Hebrew word embeddings encode the morphological distinction between denominal and root-derived verbs, supporting the two-level hypothesis of morphological word formation.
Findings
Denominal verbs are closer in embedding space to their source nouns than root-derived verbs related to the same nouns are.
Four models (fastText, GloVe, Word2Vec, AlephBERT) verify the hypothesis.
Embeddings reflect complex morphological semantic properties.
Abstract
Proponents of the Distributed Morphology framework have posited the existence of two levels of morphological word formation: a lower one, leading to loose input-output semantic relationships; and an upper one, leading to tight input-output semantic relationships. In this work, we propose to test the validity of this assumption in the context of Hebrew word embeddings. If the two-level hypothesis is borne out, we expect state-of-the-art Hebrew word embeddings to encode (1) a noun, (2) a denominal derived from it (via an upper-level operation), and (3) a verb related to the noun (via a lower-level operation on the noun's root), in such a way that the denominal (2) should be closer in the embedding space to the noun (1) than the related verb (3) is to the same noun (1). We report that this hypothesis is verified by four embedding models of Hebrew: fastText, GloVe, Word2Vec and AlephBERT.…
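The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "closer in the embedding space" is measured with cosine similarity (the abstract does not name a metric), and the function names and toy 3-dimensional vectors are hypothetical stand-ins for real Hebrew embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supports_prediction(noun, denominal, root_verb):
    """True if the denominal verb is closer to the noun than the
    root-derived verb is (the prediction of the two-level hypothesis)."""
    return cosine_similarity(denominal, noun) > cosine_similarity(root_verb, noun)


def proportion_supporting(triplets):
    """Fraction of (noun, denominal, root_verb) triplets that fit the prediction."""
    hits = sum(supports_prediction(*t) for t in triplets)
    return hits / len(triplets)


# Hypothetical toy vectors, invented for illustration only.
# Each triplet is (noun, denominal verb, root-derived verb).
toy_triplets = [
    ((1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.5, 0.5, 0.0)),
    ((0.0, 1.0, 0.0), (0.1, 0.8, 0.1), (0.4, 0.4, 0.4)),
    ((0.0, 0.0, 1.0), (0.6, 0.0, 0.4), (0.0, 0.1, 0.9)),  # counterexample
]

print(proportion_supporting(toy_triplets))
```

With real data, each vector would instead be looked up in one of the four models for an attested Hebrew noun, denominal verb, and root-derived verb. How the triplets were selected and how significance was assessed is not stated in the abstract, so the simple proportion here is only one possible summary.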
Taxonomy
Methods: Test · fastText · GloVe Embeddings
