THIVLVC: Retrieval Augmented Dependency Parsing for Latin

Luc Pommeret (STL); Thibault Wagret (ENS de Lyon; HiSoMA); Jules Deret

arXiv:2604.05564·cs.CL·April 8, 2026

TL;DR

THIVLVC is a two-stage retrieval-augmented system for Latin dependency parsing: it retrieves structurally similar sentences from the CIRCSE treebank and prompts a large language model to refine a UDPipe baseline parse, gaining +17 CLAS points on poetry (Seneca) and +1.5 on prose (Thomas Aquinas).

Contribution

The paper introduces a retrieval-augmented approach to Latin dependency parsing: treebank examples are retrieved by sentence length and POS n-gram similarity, then supplied together with the UD annotation guidelines to an LLM that refines the UDPipe baseline parse.

Findings

01

+17 CLAS points on poetry (Seneca) over baseline

02

+1.5 CLAS points on prose (Thomas Aquinas)

03

In a double-blind error analysis of 300 divergences between the system and the gold standard, 53.3% of unanimous annotator decisions favor THIVLVC.
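The findings above are reported in CLAS (content-word labeled attachment score), which scores head and relation only on content words. A minimal sketch of the metric follows, assuming gold tokenization, the functional-relation set used by the CoNLL 2018 shared task scorer, and relation subtypes ignored. The function names are illustrative, and the EvaLatin 2026 scorer may differ in detail.

```python
# Relations treated as functional (excluded from CLAS); assumption: same set
# as the CoNLL 2018 shared task scorer, plus punctuation.
FUNCTIONAL = {"aux", "case", "cc", "clf", "cop", "det", "mark", "punct"}


def universal(deprel):
    """Strip a language-specific subtype, e.g. 'obl:arg' -> 'obl'."""
    return deprel.split(":")[0]


def is_content(deprel):
    """True if the relation marks a content word."""
    return universal(deprel) not in FUNCTIONAL


def clas(gold, system):
    """F1 over content words; each input is a list of (head, deprel) per token."""
    if len(gold) != len(system):
        raise ValueError("this sketch assumes identical tokenization")
    gold_total = sum(1 for _, rel in gold if is_content(rel))
    system_total = sum(1 for _, rel in system if is_content(rel))
    correct = sum(
        1
        for (g_head, g_rel), (s_head, s_rel) in zip(gold, system)
        if is_content(g_rel)
        and g_head == s_head
        and universal(g_rel) == universal(s_rel)
    )
    if correct == 0:
        return 0.0
    precision = correct / system_total
    recall = correct / gold_total
    return 2 * precision * recall / (precision + recall)
```

Because function words and punctuation are excluded, a parser is not rewarded for attaching easy tokens such as prepositions, which makes CLAS more comparable across texts with different proportions of function words.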

Abstract

We describe THIVLVC, a two-stage system for the EvaLatin 2026 Dependency Parsing task. Given a Latin sentence, we retrieve structurally similar entries from the CIRCSE treebank using sentence length and POS n-gram similarity, then prompt a large language model to refine the baseline parse from UDPipe using the retrieved examples and UD annotation guidelines. We submit two configurations: one without retrieval and one with retrieval (RAG). On poetry (Seneca), THIVLVC improves CLAS by +17 points over the UDPipe baseline; on prose (Thomas Aquinas), the gain is +1.5 CLAS. A double-blind error analysis of 300 divergences between our system and the gold standard reveals that, among unanimous annotator decisions, 53.3% favour THIVLVC, showing annotation inconsistencies both within and across treebanks.
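The retrieval stage described in the abstract can be sketched as follows. The abstract names the two signals (sentence length and POS n-gram similarity) but not how they are computed or combined, so everything specific here is an assumption made for illustration: the multiset Jaccard overlap, the length ratio, the weight `alpha`, the bigram default, and all function names.

```python
from collections import Counter


def pos_ngrams(tags, n=2):
    """Return the multiset of POS n-grams for one tag sequence."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def ngram_similarity(a, b, n=2):
    """Multiset Jaccard overlap of POS n-grams (assumed similarity measure)."""
    grams_a, grams_b = pos_ngrams(a, n), pos_ngrams(b, n)
    union = sum((grams_a | grams_b).values())
    if union == 0:
        return 0.0
    return sum((grams_a & grams_b).values()) / union


def length_similarity(a, b):
    """Ratio of the shorter to the longer sentence length, in [0, 1]."""
    if not a or not b:
        return 0.0
    return min(len(a), len(b)) / max(len(a), len(b))


def retrieve(query_tags, treebank, k=3, alpha=0.5, n=2):
    """Return the ids of the k treebank sentences most similar to the query.

    treebank is an iterable of (sentence_id, pos_tags) pairs; alpha weights
    length similarity against n-gram similarity (hypothetical parameter).
    """
    scored = [
        (
            alpha * length_similarity(query_tags, tags)
            + (1 - alpha) * ngram_similarity(query_tags, tags, n),
            sent_id,
        )
        for sent_id, tags in treebank
    ]
    # Highest score first; ties broken by sentence id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sent_id for _, sent_id in scored[:k]]
```

In a pipeline of this shape, the retrieved sentences and their gold annotations would be placed in the LLM prompt alongside the UDPipe parse to be refined.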
