LLM_annotate: A Python package for annotating and analyzing fiction characters
Hannes Rosenbusch

TL;DR
LLM_annotate is a Python package that streamlines the process of annotating and analyzing fictional characters' personalities using large language models, with tools for text processing, trait inference, and quality validation.
Contribution
It introduces a comprehensive, customizable workflow for character analysis in fiction, integrating LLM-based annotation, disambiguation, and quality assessment within an accessible GUI.
Findings
Enables efficient annotation of character traits in full texts.
Supports validation of annotations through a human-in-the-loop GUI.
Demonstrates applicability with examples from movies and novels.
Abstract
LLM_annotate is a Python package for analyzing the personality of fiction characters with large language models. It standardizes workflows for annotating character behaviors in full texts (e.g., books and movie scripts), inferring character traits, and validating annotation/inference quality via a human-in-the-loop GUI. The package includes functions for text chunking, LLM-based annotation, character name disambiguation, quality scoring, and computation of character-level statistics and embeddings. Researchers can use any LLM (commercial, open-source, or custom) within LLM_annotate. Through tutorial examples using The Simpsons Movie and the novel Pride and Prejudice, I demonstrate the usage of the package for efficient and reproducible character analyses.
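The chunking, annotation, and name-disambiguation steps named in the abstract can be pictured with a minimal sketch. This is not the LLM_annotate API: every name here (`chunk_text`, `annotate`, `toy_annotator`, the alias table) is hypothetical and chosen only for illustration. The annotator is a plain callable, which mirrors the idea that any LLM could be plugged in; a trivial string matcher stands in for the model.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical type: an annotator maps a text chunk to (character, behavior) pairs.
# In practice this would wrap a call to whichever LLM the researcher chooses.
Annotator = Callable[[str], List[Tuple[str, str]]]


def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows of at most max_words, sharing `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def annotate(chunks: List[str], annotator: Annotator,
             aliases: Dict[str, str]) -> Dict[str, List[str]]:
    """Collect behaviors per character, merging name variants via an alias table."""
    by_character = defaultdict(list)
    for chunk in chunks:
        for name, behavior in annotator(chunk):
            key = name.strip()
            canonical = aliases.get(key.lower(), key)  # unknown names pass through
            by_character[canonical].append(behavior)
    return dict(by_character)


def toy_annotator(chunk: str) -> List[Tuple[str, str]]:
    """Stand-in for an LLM call: report the chunk for each known name it contains."""
    return [(name, chunk) for name in ("Elizabeth", "Lizzy", "Darcy") if name in chunk]


# Assumed alias table; a real workflow would build or review this per text.
ALIASES = {
    "elizabeth": "Elizabeth Bennet",
    "lizzy": "Elizabeth Bennet",
    "darcy": "Fitzwilliam Darcy",
}

sample = "Elizabeth laughed. Lizzy walked. Darcy stared."
result = annotate(chunk_text(sample, max_words=2, overlap=0), toy_annotator, ALIASES)
```

Keeping the annotator as an injected callable means the chunking and disambiguation logic can be tested without any model access, and swapping models only changes one argument.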
Taxonomy
Topics: Authorship Attribution and Profiling · Digital Humanities and Scholarship · Computational and Text Analysis Methods
