LLM_annotate: A Python package for annotating and analyzing fiction characters

Hannes Rosenbusch

arXiv:2601.03274·cs.CL·January 8, 2026

Open Access

TL;DR

LLM_annotate is a Python package that streamlines the process of annotating and analyzing fictional characters' personalities using large language models, with tools for text processing, trait inference, and quality validation.

Contribution

It introduces a customizable workflow for character analysis in fiction that combines LLM-based annotation, character name disambiguation, and quality assessment, with a human-in-the-loop GUI for validating the results.

Findings

01. Enables efficient annotation of character traits in full texts.

02. Supports validation of annotations through a human-in-the-loop GUI.

03. Demonstrates applicability with tutorial examples from The Simpsons Movie and the novel Pride and Prejudice.

Abstract

LLM_annotate is a Python package for analyzing the personality of fiction characters with large language models. It standardizes workflows for annotating character behaviors in full texts (e.g., books and movie scripts), inferring character traits, and validating annotation/inference quality via a human-in-the-loop GUI. The package includes functions for text chunking, LLM-based annotation, character name disambiguation, quality scoring, and computation of character-level statistics and embeddings. Researchers can use any LLM, commercial, open-source, or custom, within LLM_annotate. Through tutorial examples using The Simpsons Movie and the novel Pride and Prejudice, I demonstrate the usage of the package for efficient and reproducible character analyses.
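The first steps named in the abstract (text chunking, LLM-based annotation, and collecting results per character) can be illustrated with a minimal sketch. This is not the LLM_annotate API: the names `chunk_text`, `annotate_chunks`, and `toy_annotator`, as well as the chunk size and overlap parameters, are hypothetical and chosen only for illustration. The annotator is passed in as a plain callable, which mirrors the idea that any LLM (commercial, open-source, or custom) could fill that role.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical annotation shape: (character name, described behavior).
Annotation = Tuple[str, str]


def chunk_text(text: str, chunk_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows that overlap by `overlap` words."""
    if chunk_words <= overlap:
        raise ValueError("chunk_words must exceed overlap")
    words = text.split()
    step = chunk_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        # Stop once the current window reaches the end of the text.
        if start + chunk_words >= len(words):
            break
    return chunks


def annotate_chunks(
    chunks: List[str],
    annotate_fn: Callable[[str], List[Annotation]],
) -> Dict[str, List[str]]:
    """Run a user-supplied annotator on each chunk and group behaviors by character."""
    by_character: Dict[str, List[str]] = defaultdict(list)
    for chunk in chunks:
        for name, behavior in annotate_fn(chunk):
            by_character[name].append(behavior)
    return dict(by_character)


def toy_annotator(chunk: str) -> List[Annotation]:
    """Stand-in for an LLM call (assumption): tags a chunk with each name it mentions."""
    names = ("Elizabeth", "Darcy")
    return [(name, chunk) for name in names if name in chunk]
```

In practice, `toy_annotator` would be replaced by a function that prompts a language model and parses its response. Overlapping windows are one simple way to reduce the chance that a behavior is split across a chunk boundary; the package itself may chunk texts differently.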

Taxonomy

Topics: Authorship Attribution and Profiling · Digital Humanities and Scholarship · Computational and Text Analysis Methods