Vulgaris: Analysis of a Corpus for Middle-Age Varieties of Italian Language

Andrea Zugarini; Matteo Tiezzi; Marco Maggini

arXiv:2010.05993·cs.CL·October 14, 2020

TL;DR

Vulgaris is a comprehensive corpus of medieval Italian texts from 1200-1600, enabling diachronic, dialectal, and stylistic linguistic analysis of regional and temporal language variations.

Contribution

This work introduces Vulgaris, a novel corpus of medieval Italian texts with detailed metadata, facilitating studies of language evolution and dialectal differences.

Findings

01. The corpus covers texts from 1200-1600 across Italy.

02. Authors are grouped by stylistic and chronological features.

03. Statistical analysis reveals regional and temporal language patterns.

Abstract

Italian is a Romance language that has its roots in Vulgar Latin. Modern Italian began to emerge in Tuscany around the 14th century, a development mainly attributed to the works of Dante Alighieri, Francesco Petrarca and Giovanni Boccaccio, who are among the most acclaimed Tuscan authors of the medieval age. However, because of the past fragmentation of its territory, Italy has been characterized by a wide variety of dialects, which are often only loosely related to each other. Italian has absorbed influences from many of these dialects, as well as from other languages, owing to the dominion of parts of the country by other nations such as Spain and France. In this work we present Vulgaris, a project aimed at studying a corpus of Italian textual resources from authors of different regions, spanning the period between 1200 and 1600. Each composition is associated with its author, and…

Taxonomy

Topics: Authorship Attribution and Profiling · Natural Language Processing Techniques · Linguistic Variation and Morphology