NodeSynth: Socially Aligned Synthetic Data for AI Evaluation

Qazi Mamunur Rashid; Xuan Yang; Zhengzhe Yang; Yanzhou Pan; Erin van Liemt; Darlene Neal; Kshitij Pancholi; Jamila Smith-Loud

arXiv:2605.14381 · cs.LG · May 19, 2026

TL;DR

NodeSynth generates socially relevant synthetic queries for evaluating AI models. The queries expose failure modes in mainstream LLMs and deficiencies in safety guard models, and the prototype and datasets are open-sourced for scalable assessment.

Contribution

We introduce NodeSynth, an evidence-grounded methodology that generates socially nuanced synthetic queries for evaluating AI models in sensitive domains.

Findings

01

Across four mainstream LLMs, NodeSynth elicited failure rates up to five times higher than human-authored benchmarks.

02

Ablation studies confirm that granular taxonomic expansion significantly drives the elicited failure rates.

03

Independent validation reveals critical deficiencies in prominent guard models such as Llama-Guard-3.

Abstract

Recent advancements in generative AI facilitate large-scale synthetic data generation for model evaluation. However, without targeted approaches, these datasets often lack the sociotechnical nuance required for sensitive domains. We introduce NodeSynth, an evidence-grounded methodology that generates socially relevant synthetic queries by leveraging a fine-tuned taxonomy generator (TaG) anchored in real-world evidence. Evaluated against four mainstream LLMs (e.g., Claude 4.5 Haiku), NodeSynth elicited failure rates up to five times higher than human-authored benchmarks. Ablation studies confirm that our granular taxonomic expansion significantly drives these failure rates, while independent validation reveals critical deficiencies in prominent guard models (e.g., Llama-Guard-3). We open-source our end-to-end research prototype and datasets to enable scalable, high-stakes model…
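The "up to five times higher" comparison in the abstract is a ratio of failure rates between two query sets. The sketch below shows that arithmetic only, assuming each model response has already been labeled as a failure or not. The function names and the outcome lists are hypothetical placeholders; they are not taken from the paper or its repository, and the numbers are not results.

```python
from typing import Sequence


def failure_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of queries on which the model failed (True means failure)."""
    if not outcomes:
        raise ValueError("outcomes must be non-empty")
    return sum(outcomes) / len(outcomes)


def failure_ratio(synthetic: Sequence[bool], human: Sequence[bool]) -> float:
    """How many times higher the synthetic-set failure rate is than the human-authored one."""
    base = failure_rate(human)
    if base == 0:
        raise ValueError("human-authored failure rate is zero; ratio is undefined")
    return failure_rate(synthetic) / base


# Placeholder labels invented for illustration only (NOT data from the paper).
synthetic_outcomes = [True, True, False, True, False, False, True, False, False, False]
human_outcomes = [True, False, False, False, False, False, False, False, False, False]

ratio = failure_ratio(synthetic_outcomes, human_outcomes)
print(f"synthetic: {failure_rate(synthetic_outcomes):.2f}, "
      f"human-authored: {failure_rate(human_outcomes):.2f}, ratio: {ratio:.1f}x")
```

How a response is judged to be a failure (for example by human raters or a guard model) is outside this sketch; the abstract does not specify it.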

Code & Models

Repositories

google-research/nodesynth (GitHub)
