Automated Analysis of Global AI Safety Initiatives: A Taxonomy-Driven LLM Approach
Takayuki Semitsu, Naoto Kiribuchi, Kengo Zenitani

TL;DR
This paper introduces an automated framework that uses large language models to compare AI safety policy documents under a shared activity taxonomy, and assesses how stable the results are across models and how they relate to human judgments.
Contribution
The work develops a novel LLM-based crosswalk system for policy comparison, evaluating its stability, validity, and alignment with human judgments across multiple documents.
Findings
Model choice substantially affects crosswalk results.
Some document pairs yield high disagreement across models.
A human evaluation by three experts on two document pairs shows high inter-annotator agreement.
Abstract
We present an automated crosswalk framework that compares an AI safety policy document pair under a shared taxonomy of activities. Using the activity categories defined in the Activity Map on AI Safety as fixed aspects, the system extracts and maps relevant activities, then produces for each aspect a short summary for each document, a brief comparison, and a similarity score. We assess the stability and validity of LLM-based crosswalk analysis across public policy documents. Using five large language models, we perform crosswalks on ten publicly available documents and visualize mean similarity scores with a heatmap. The results show that model choice substantially affects the crosswalk outcomes, and that some document pairs yield high disagreement across models. A human evaluation by three experts on two document pairs shows high inter-annotator agreement, while model scores still differ…
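The aggregation step described in the abstract (mean similarity per document pair across models, plus a measure of how much the models disagree) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the input layout, the 0-to-1 score scale, and the use of population standard deviation as the disagreement measure are all assumptions made for this example.

```python
from statistics import mean, pstdev


def aggregate_scores(scores):
    """Summarize per-model similarity scores for each document pair.

    `scores` is a hypothetical layout: {model_name: {(doc_a, doc_b): score}}.
    Pairs are treated as unordered, so (A, B) and (B, A) are merged.
    """
    collected = {}
    for per_model in scores.values():
        for pair, value in per_model.items():
            key = tuple(sorted(pair))
            collected.setdefault(key, []).append(value)
    # Mean drives the heatmap; spread (assumed: population std) flags disagreement.
    return {
        pair: {"mean": mean(values), "spread": pstdev(values), "n": len(values)}
        for pair, values in collected.items()
    }


def to_matrix(summary, docs, field="mean"):
    """Build a symmetric matrix (list of lists) suitable for a heatmap.

    Cells with no score, including the diagonal, are left as None.
    """
    index = {doc: i for i, doc in enumerate(docs)}
    matrix = [[None] * len(docs) for _ in docs]
    for (doc_a, doc_b), stats in summary.items():
        i, j = index[doc_a], index[doc_b]
        matrix[i][j] = matrix[j][i] = stats[field]
    return matrix


if __name__ == "__main__":
    # Made-up scores from two placeholder models, for illustration only.
    example = {
        "model_1": {("doc_A", "doc_B"): 0.75},
        "model_2": {("doc_B", "doc_A"): 0.25},
    }
    summary = aggregate_scores(example)
    print(summary)
    print(to_matrix(summary, ["doc_A", "doc_B"]))
```

Passing `field="spread"` to `to_matrix` gives a second matrix that highlights the document pairs on which the models disagree most, which is one way to surface the high-disagreement pairs the abstract mentions.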
