LLM-based Atomic Propositions help weak extractors: Evaluation of a Propositioner for triplet extraction

Luc Pommeret (STL); Thomas Gerald (LISN); Patrick Paroubek (STL); Sahar Ghannay (STL); Christophe Servan (STL; AMIAD); Sophie Rosset (LISN; STL)

arXiv:2604.02866 · cs.CL · April 6, 2026

TL;DR

This paper finds that decomposing text into atomic propositions improves triplet extraction mainly for weaker extractors, and introduces MPropositionneur-V2, a small multilingual model that performs the decomposition.

Contribution

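The paper introduces MPropositionneur-V2, a small multilingual model covering six European languages and trained by knowledge distillation from Qwen3-32B into a Qwen3-0.6B architecture. It evaluates whether atomic propositions improve triplet extraction in two paradigms, entity-centric (GLiREL) and generative (Qwen3).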

Findings

01 Atomic propositions improve relation recall for weaker extractors (GLiREL, CoreNLP, 0.6B models).

02 In the multilingual setting, atomic propositions also raise overall accuracy for weaker extractors.

03 For stronger LLMs, a fallback combination strategy recovers the entity recall lost when extracting from propositions.

Abstract

Knowledge Graph construction from natural language requires extracting structured triplets from complex, information-dense sentences. In this paper, we investigate whether decomposing text into atomic propositions (minimal, semantically autonomous units of information) can improve triplet extraction. We introduce MPropositionneur-V2, a small multilingual model covering six European languages, trained by knowledge distillation from Qwen3-32B into a Qwen3-0.6B architecture, and we evaluate its integration into two extraction paradigms: entity-centric (GLiREL) and generative (Qwen3). Experiments on SMiLER, FewRel, DocRED and CaRB show that atomic propositions benefit weaker extractors (GLiREL, CoreNLP, 0.6B models), improving relation recall and, in the multilingual setting, overall accuracy. For stronger LLMs, a fallback combination strategy recovers entity recall losses while…
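The pipeline described in the abstract (decompose first, then extract) can be illustrated with a minimal sketch. Everything named here is hypothetical: `extract_with_propositions`, `toy_propositioner`, and `toy_extractor` are stand-ins, not MPropositionneur-V2, GLiREL, or Qwen3. The fallback rule shown (use the original text only when the propositions yield no triplets) is an assumption made for illustration, since the truncated abstract does not specify the paper's actual combination strategy.

```python
from typing import Callable, Iterable

Triplet = tuple[str, str, str]


def extract_with_propositions(
    text: str,
    propositioner: Callable[[str], list[str]],
    extractor: Callable[[str], Iterable[Triplet]],
) -> set[Triplet]:
    """Run the extractor on each atomic proposition and pool the triplets.

    Assumed fallback rule (not taken from the paper): if the propositions
    yield nothing, run the extractor on the original text instead.
    """
    triplets: set[Triplet] = set()
    for proposition in propositioner(text):
        triplets.update(extractor(proposition))
    if not triplets:
        triplets.update(extractor(text))
    return triplets


def toy_propositioner(text: str) -> list[str]:
    # Toy stand-in: treats each ';'-separated clause as one proposition.
    return [part.strip() for part in text.split(";") if part.strip()]


def toy_extractor(sentence: str) -> list[Triplet]:
    # Toy "weak" extractor: only handles three-word subject-relation-object text.
    words = sentence.split()
    if len(words) == 3:
        return [(words[0], words[1], words[2])]
    return []


if __name__ == "__main__":
    dense = "Marie discovered polonium; Marie won prizes"
    print(toy_extractor(dense))
    print(extract_with_propositions(dense, toy_propositioner, toy_extractor))
```

In this toy setup the weak extractor finds nothing in the dense sentence but recovers both triplets once the sentence is split, which mirrors the direction of the reported effect without reproducing any of the paper's results.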
