# Modeling the Complexity and Descriptive Adequacy of Construction Grammars

**Authors:** Jonathan Dunn

arXiv: 1904.05588 · 2019-04-12

## TL;DR

This paper applies the Minimum Description Length (MDL) principle to evaluate and discover Construction Grammars (CxGs) for English, Spanish, French, German, and Italian. It finds that more complex grammars with access to multiple levels of representation generalize better over unannotated corpora, as measured by compression, than single-representation grammars.

## Contribution

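The paper introduces an MDL-based framework that scores a candidate CxG by combining two quantities: the encoding size of the grammar (its complexity) and the encoding size of a corpus given that grammar (its descriptive adequacy). The combined score measures grammar quality against unannotated corpora and supports grammar discovery.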

## Key findings

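- The discovered CxGs provide significant generalizations, measured as compression of unannotated corpora.
- More complex CxGs with access to multiple levels of representation provide greater generalizations than single-representation CxGs.
- The approach supports discovery-device CxGs for English, Spanish, French, German, and Italian.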

## Abstract

This paper uses the Minimum Description Length paradigm to model the complexity of CxGs (operationalized as the encoding size of a grammar) alongside their descriptive adequacy (operationalized as the encoding size of a corpus given a grammar). These two quantities are combined to measure the quality of potential CxGs against unannotated corpora, supporting discovery-device CxGs for English, Spanish, French, German, and Italian. The results show (i) that these grammars provide significant generalizations as measured using compression and (ii) that more complex CxGs with access to multiple levels of representation provide greater generalizations than single-representation CxGs.
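The two-part MDL score described above can be illustrated with a toy calculation. The sketch below is not the paper's encoding scheme: it assumes purely lexical constructions (fixed word sequences), a uniform code over words and constructions, and greedy longest-match parsing. The function names (`grammar_bits`, `corpus_bits`, `mdl_bits`) and the example corpus are hypothetical and exist only for illustration.

```python
import math

def grammar_bits(grammar, vocab_size):
    """L(G): assume every slot of every construction costs log2(vocab_size) bits."""
    return sum(len(cx) for cx in grammar) * math.log2(vocab_size)

def corpus_bits(corpus, grammar, vocab_size):
    """L(D|G): assume one uniform code over all words plus all constructions.

    Each sentence is parsed greedily left to right; at each position the
    longest matching construction is emitted as one unit, otherwise one word.
    """
    unit_bits = math.log2(vocab_size + len(grammar))
    units = 0
    for sentence in corpus:
        i = 0
        while i < len(sentence):
            match = max(
                (cx for cx in grammar if tuple(sentence[i:i + len(cx)]) == cx),
                key=len,
                default=None,
            )
            i += len(match) if match else 1
            units += 1
    return units * unit_bits

def mdl_bits(corpus, grammar):
    """Total description length: grammar cost plus corpus cost given the grammar."""
    vocab_size = len({word for sentence in corpus for word in sentence})
    return grammar_bits(grammar, vocab_size) + corpus_bits(corpus, grammar, vocab_size)

# Hypothetical toy corpus: four sentences sharing one recurring sequence.
corpus = [
    ["give", "me", "the", "book"],
    ["give", "me", "the", "pen"],
    ["give", "me", "the", "key"],
    ["give", "me", "the", "map"],
]
grammar = [("give", "me", "the")]

baseline = mdl_bits(corpus, [])           # no grammar: every word encoded on its own
with_grammar = mdl_bits(corpus, grammar)  # pays for the grammar, saves on the corpus
print(f"baseline: {baseline:.2f} bits, with grammar: {with_grammar:.2f} bits")
```

In this toy setting the grammar is worth its own encoding cost only because the construction recurs often enough to shorten the corpus encoding; a construction that matched once would increase the total. Under these simplified assumptions, that trade-off is what lets one score compare candidate grammars of different complexity.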

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.05588/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1904.05588/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1904.05588/full.md

---
Source: https://tomesphere.com/paper/1904.05588