Variance-Aware LLM Annotation for Strategy Research: Sources, Diagnostics, and a Protocol for Reliable Measurement
Arnaldo Camuffo, Alfonso Gambardella, Saeid Kazemi, Jakub Malachowski, Abhinav Pandey

TL;DR
This paper identifies sources of variance in LLM-generated annotations for strategy research, demonstrating how design choices impact results and proposing a protocol for reliable, reproducible measurement using LLMs.
Contribution
It introduces a variance-aware protocol for LLM annotation, addressing instability issues and establishing standards for reliable measurement in strategy research.
Findings
Minor design choices can shift outcomes by 12-85 percentage points
Variance sources threaten reproducibility and bias econometric estimates
Proposes a protocol with sampling, aggregation, and reporting standards
Abstract
Large language models (LLMs) offer strategy researchers powerful tools for annotating text at scale, but treating LLM-generated labels as deterministic overlooks substantial instability. Grounded in content analysis and generalizability theory, we diagnose five variance sources: construct specification, interface effects, model preferences, output extraction, and system-level aggregation. Empirical demonstrations show that minor design choices (prompt phrasing, model selection) can shift outcomes by 12-85 percentage points. Such variance threatens not only reproducibility but econometric identification: annotation errors correlated with covariates bias parameter estimates regardless of average accuracy. We develop a variance-aware protocol specifying sampling budgets, aggregation rules, and reporting standards, and delineate scope conditions where LLM annotation should not be used. These…
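The abstract names sampling budgets and aggregation rules but does not spell them out in this excerpt. As a rough illustration only, and not the authors' protocol, the sketch below assumes each document is annotated several times and aggregated by majority vote, with the agreement share reported alongside the label. The function name `aggregate_labels`, the tie rule, and the sample data are all hypothetical.

```python
from collections import Counter


def aggregate_labels(samples):
    """Majority vote over repeated labels for a single item.

    Returns (label, agreement), where agreement is the share of samples
    that match the most frequent label. A tie for first place returns
    None as the label so the item can be flagged for manual review
    (an assumed rule, not one taken from the paper).
    """
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(samples).most_common()
    top_label, top_count = counts[0]
    agreement = top_count / len(samples)
    if len(counts) > 1 and counts[1][1] == top_count:
        return None, agreement
    return top_label, agreement


# Hypothetical repeated annotations of two documents (illustrative data only).
runs = {
    "doc_1": ["yes", "yes", "yes", "no", "yes"],
    "doc_2": ["yes", "no", "no", "yes"],
}
results = {doc: aggregate_labels(labels) for doc, labels in runs.items()}
```

Reporting the agreement share next to each aggregated label keeps the run-to-run variance visible instead of hiding it behind a single deterministic-looking value.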
Taxonomy
Topics: Computational and Text Analysis Methods · Forecasting Techniques and Applications · Machine Learning in Materials Science
