Decision Quality Evaluation Framework at Pinterest

Yuqi Tian; Robert Paine; Attila Dobi; Kevin O'Sullivan; Aravindh Manickavasagam; Faisal Farooq

arXiv:2602.15809 · stat.AP · February 18, 2026

Open Access

TL;DR

This paper presents a data-driven framework for evaluating the quality of moderation decisions at Pinterest. It combines an expert-curated benchmark with automated sampling to make evaluation more trustworthy and to support policy management at scale.

Contribution

The paper introduces a Decision Quality Evaluation Framework built on a high-trust Golden Set curated by subject matter experts, with propensity-score sampling to expand dataset coverage for assessing content moderation decisions.

Findings

01. Benchmarking LLM cost-performance trade-offs

02. Establishing data-driven prompt optimization methodology

03. Ensuring policy content integrity through continuous validation

Abstract

Online platforms require robust systems to enforce content safety policies at scale. A critical component of these systems is the ability to evaluate the quality of moderation decisions made by both human agents and Large Language Models (LLMs). However, this evaluation is challenging due to the inherent trade-offs between cost, scale, and trustworthiness, along with the complexity of evolving policies. To address this, we present a comprehensive Decision Quality Evaluation Framework developed and deployed at Pinterest. The framework is centered on a high-trust Golden Set (GDS) curated by subject matter experts (SMEs), which serves as a ground truth benchmark. We introduce an automated intelligent sampling pipeline that uses propensity scores to efficiently expand dataset coverage. We demonstrate the framework's practical application in several key areas: benchmarking the…
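As a rough illustration of how a Golden Set can serve as a ground truth benchmark, the sketch below scores one decision source (a human agent or an LLM) against expert labels on the items both have in common. It is a minimal sketch under stated assumptions, not the paper's pipeline: the label values "violating" and "safe", the `QualityReport` type, and the `score_against_golden_set` function are hypothetical names chosen for this example, and the metrics shown (accuracy, precision, recall) are standard choices rather than ones confirmed by the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityReport:
    """Summary of how one decision source agrees with the Golden Set."""
    n: int
    accuracy: float
    precision: float
    recall: float


def score_against_golden_set(golden, decisions, positive="violating"):
    """Compare decisions with Golden Set labels on shared item IDs.

    `golden` and `decisions` map item IDs to labels. The label
    vocabulary ("violating" / "safe") is an assumption made for this
    sketch, not taken from the paper.
    """
    # Only items present in both mappings can be scored.
    shared = sorted(golden.keys() & decisions.keys())
    if not shared:
        raise ValueError("no overlap between Golden Set and decisions")

    tp = fp = fn = correct = 0
    for item_id in shared:
        truth, pred = golden[item_id], decisions[item_id]
        if truth == pred:
            correct += 1
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1

    # Fall back to 0.0 when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return QualityReport(len(shared), correct / len(shared), precision, recall)


if __name__ == "__main__":
    golden = {"a": "violating", "b": "safe", "c": "violating", "d": "safe"}
    llm = {"a": "violating", "b": "violating", "c": "safe", "d": "safe", "e": "safe"}
    print(score_against_golden_set(golden, llm))
```

Running the same function over several decision sources (for example, different LLMs or prompt variants) would give comparable per-source reports, which is one plausible way to support the benchmarking use cases the abstract mentions.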

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Spam and Phishing Detection