Quantifying syntax similarity with a polynomial representation of dependency trees

Pengyu Liu; Tinghao Feng; Rui Liu

arXiv:2211.07005·cs.CL·November 15, 2022·1 cites

TL;DR

This paper introduces a graph polynomial that represents the dependency tree of a sentence, together with a polynomial-based measure of syntax similarity, enabling cross-linguistic comparison and measurement of syntax diversity.

Contribution

It presents a novel polynomial representation for dependency trees that captures detailed syntactic information and facilitates syntax similarity and diversity analysis.

Findings

01 Effective differentiation of tree structures using the polynomial method

02 Application to multilingual datasets reveals syntactic similarities and differences

03 Potential for measuring syntax diversity across corpora

Abstract

We introduce a graph polynomial that distinguishes tree structures to represent dependency grammar and a measure based on the polynomial representation to quantify syntax similarity. The polynomial encodes accurate and comprehensive information about the dependency structure and dependency relations of words in a sentence. We apply the polynomial-based methods to analyze sentences in the Parallel Universal Dependencies treebanks. Specifically, we compare the syntax of sentences and their translations in different languages, and we perform a syntactic typology study of available languages in the Parallel Universal Dependencies treebanks. We also demonstrate and discuss the potential of the methods in measuring syntax diversity of corpora.
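To make the idea of a tree-distinguishing polynomial concrete, here is a minimal sketch in Python. It assumes a simple bivariate recursion for rooted, unlabeled trees (a leaf maps to x, and an internal node maps to y plus the product of its children's polynomials). Whether this matches the paper's exact definition is an assumption; in particular, the sketch ignores dependency relation labels, which the abstract says the paper's polynomial encodes. The helper names (`heads_to_children`, `tree_poly`, `coeff_distance`) and the coefficient-wise distance are hypothetical illustrations, not the authors' similarity measure.

```python
from collections import Counter

# A polynomial in x and y is stored as {(i, j): coeff}, meaning coeff * x^i * y^j.

def poly_mul(p, q):
    """Multiply two bivariate polynomials."""
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return out

def heads_to_children(heads):
    """Convert a head list (heads[i] is the 1-based head of token i+1,
    0 marks the root, as in CoNLL-U) into a child map and the root id."""
    children, root = {}, None
    for tok, head in enumerate(heads, start=1):
        if head == 0:
            root = tok
        else:
            children.setdefault(head, []).append(tok)
    return children, root

def tree_poly(children, node):
    """Assumed recursion: leaf -> x; internal node -> y + product of child polynomials."""
    kids = children.get(node, [])
    if not kids:
        return Counter({(1, 0): 1})          # x
    prod = Counter({(0, 0): 1})              # constant 1
    for kid in kids:
        prod = poly_mul(prod, tree_poly(children, kid))
    prod[(0, 1)] += 1                        # add y
    return prod

def coeff_distance(p, q):
    """Hypothetical illustration only: sum of absolute coefficient differences."""
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in set(p) | set(q))

# Two three-token sentences with different dependency shapes.
chain = tree_poly(*heads_to_children([2, 3, 0]))  # token 1 <- 2 <- 3 (root)
star = tree_poly(*heads_to_children([2, 0, 2]))   # root 2 with children 1 and 3
```

Under this recursion the chain evaluates to x + 2y and the star to x^2 + y, so the two shapes receive different polynomials, and any distance on coefficient vectors (such as the illustrative `coeff_distance`) gives a numeric comparison between them.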

Code & Models

Repositories

pliumath/dependencies (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Authorship Attribution and Profiling