An Annotation Scheme and Classifier for Personal Facts in Dialogue

Konstantin Zaitsev

arXiv:2605.10339·cs.CL·May 12, 2026

TL;DR

This paper introduces an enhanced annotation scheme and a transformer-based classifier for personal facts in dialogue, improving accuracy over few-shot LLM baselines and enabling structured fact management.

Contribution

It extends existing schemes with new categories and attributes, and provides a high-performing, resource-efficient classifier trained on a large annotated dataset.
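As a rough illustration of what a structured record under such a scheme could look like, the sketch below stores one fact per object and filters on its attributes. The category and attribute names (Demographics, Possessions, Duration, Validity, Followup) come from the abstract; the value sets (`"long_term"`, `"short_term"`, boolean flags), the `PersonalFact` class, and both helper functions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PersonalFact:
    """One annotated fact. Field names follow the abstract; values are assumed."""
    text: str
    category: str   # e.g. "Demographics" or "Possessions"
    duration: str   # assumed values: "long_term" or "short_term"
    validity: bool  # assumed meaning: True if the fact is worth storing
    followup: bool  # assumed meaning: True if it suits dialogue continuation


def storable(facts: List[PersonalFact]) -> List[PersonalFact]:
    """Quality filter: keep only facts marked valid."""
    return [f for f in facts if f.validity]


def followup_candidates(facts: List[PersonalFact]) -> List[PersonalFact]:
    """Valid facts that are also flagged as good conversation hooks."""
    return [f for f in facts if f.validity and f.followup]


# Made-up example facts, not drawn from the annotated dataset.
facts = [
    PersonalFact("I am 34 years old.", "Demographics", "long_term", True, False),
    PersonalFact("I just bought a road bike.", "Possessions", "long_term", True, True),
    PersonalFact("I have that thing.", "Possessions", "short_term", False, False),
]

print([f.text for f in storable(facts)])
print([f.text for f in followup_candidates(facts)])
```

Keeping the attributes as separate fields, rather than folding them into the category label, is what lets storage, quality filtering, and follow-up selection be expressed as independent filters.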

Findings

01

The classifier achieves 81.6% macro F1, outperforming the best few-shot LLM baseline (GPT-5.4-mini, 72.92%) by nearly 9 percentage points.

02

New categories and attributes enable better structured storage and filtering of personal facts.

03

Error analysis highlights persistent challenges with semantic boundaries and pragmatic reasoning.
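For readers unfamiliar with the headline metric, the following is a minimal sketch of macro F1: the unweighted mean of per-class F1 scores, so rare classes count as much as frequent ones. The toy labels are invented for illustration, and how the paper aggregates scores across its attributes is not stated here, so this shows only the generic definition. The last lines reproduce the reported gap from the two published figures.

```python
from typing import Sequence


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if the class never occurs."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every label seen in truth or prediction."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy data, invented for illustration only.
y_true = ["Demographics", "Demographics", "Demographics", "Possessions"]
y_pred = ["Demographics", "Demographics", "Possessions", "Possessions"]
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")

# Gap between the two reported scores, in percentage points.
gap = round(81.6 - 72.92, 2)
print(f"gap = {gap} points")
```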

Abstract

The advancement of Large Language Models (LLMs) has enabled their application in personalized dialogue systems. We present an extended annotation scheme for personal fact classification that addresses limitations in existing approaches, particularly PeaCoK. Our scheme introduces new categories (Demographics, Possessions) and attributes (Duration, Validity, Followup) that enable structured storage, quality filtering, and identification of facts suitable for dialogue continuation. We manually annotated 2,779 facts from Multi-Session Chat and trained a multi-head classifier based on transformer encoders. Combined with the Gemma-300M encoder, the classifier achieves 81.6 ± 2.6% macro F1, outperforming all few-shot LLM baselines (best: GPT-5.4-mini, 72.92%) by nearly 9 percentage points while requiring substantially fewer computational resources. Error analysis reveals persistent…
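The multi-head idea mentioned in the abstract can be sketched as one shared sentence representation feeding several small classification heads, one per label to predict. The code below is a toy in plain Python, not the paper's architecture: the transformer encoder is omitted and replaced by a hand-written two-dimensional vector, the weights are set by hand rather than trained, and the choice of heads (`category`, `followup`) and their label sets are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


class LinearHead:
    """One classification head: a linear layer followed by argmax."""

    def __init__(self, labels: List[str], weights: List[List[float]], biases: List[float]):
        self.labels = labels
        self.weights = weights  # one weight row per label
        self.biases = biases

    def predict(self, embedding: Sequence[float]) -> str:
        scores = [
            sum(w * x for w, x in zip(row, embedding)) + b
            for row, b in zip(self.weights, self.biases)
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.labels[best]


class MultiHeadClassifier:
    """Applies every head to the same shared embedding."""

    def __init__(self, heads: Dict[str, LinearHead]):
        self.heads = heads

    def predict(self, embedding: Sequence[float]) -> Dict[str, str]:
        return {name: head.predict(embedding) for name, head in self.heads.items()}


# Hand-set toy weights; a real system would learn these jointly with the encoder.
classifier = MultiHeadClassifier({
    "category": LinearHead(["Demographics", "Possessions"],
                           [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "followup": LinearHead(["no", "yes"],
                           [[0.5, 0.5], [0.0, 1.0]], [0.0, 0.0]),
})

embedding = [0.2, 0.9]  # stand-in for an encoder output vector
prediction = classifier.predict(embedding)
print(prediction)
```

Sharing one encoder pass across all heads is one plausible reason such a model needs far less compute than prompting an LLM once per fact, though the abstract does not spell out the cost breakdown.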
