Supernova Event Dataset: Interpreting Large Language Models' Personality through Critical Event Analysis
Pranav Agarwal, Ioana Ciucă

TL;DR
This paper introduces the Supernova Event Dataset to interpret LLM personalities by benchmarking their event extraction and ranking abilities, revealing distinct traits and enhancing interpretability across models.
Contribution
The work presents a novel dataset and framework for analyzing LLM personalities through event interpretation, incorporating a judge-based evaluation method for deeper insight.
Findings
Orca 2 shows emotional reasoning traits.
Qwen 2.5 exhibits a strategic, analytical style.
Claude Sonnet 3.7 emphasizes conceptual framing.
Abstract
Large Language Models (LLMs) are increasingly integrated into everyday applications. As their influence grows, understanding their decision making and underlying personality becomes essential. In this work, we interpret model personality using our proposed Supernova Event Dataset, a novel dataset with diverse articles spanning biographies, historical events, news, and scientific discoveries. We use this dataset to benchmark LLMs on extracting and ranking key events from text, a subjective and complex challenge that requires reasoning over long-range context and modeling causal chains. We evaluate small models like Phi-4, Orca 2, and Qwen 2.5, and large, stronger models such as Claude 3.7, Gemini 2.5, and OpenAI o3, and propose a framework where another LLM acts as a judge to infer each model's personality based on its selection and classification of events. Our analysis shows distinct…
Taxonomy
Topics: Gamma-ray bursts and supernovae
