Leveraging semantic similarity for experimentation with AI-generated treatments
Lei Shi, David Arbour, Raghavendra Addanki, Ritwik Sinha, Avi Feller

TL;DR
This paper introduces a kernel-based method for learning low-dimensional representations of high-dimensional, LLM-generated treatments, supporting adaptive online experimentation and causal effect estimation.
Contribution
The paper proposes double kernel representation learning, together with an efficient alternating-minimization algorithm that has convergence guarantees, for analyzing complex AI-generated treatments.
Findings
Effective low-dimensional representations of treatments
Improved adaptive online experimentation strategies
Numerical experiments demonstrating method efficacy
Abstract
Large Language Models (LLMs) enable a new form of digital experimentation where treatments combine human and model-generated content in increasingly sophisticated ways. The main methodological challenge in this setting is representing these high-dimensional treatments without losing their semantic meaning or rendering analysis intractable. Here, we address this problem by focusing on learning low-dimensional representations that capture the underlying structure of such treatments. These representations enable downstream applications such as guiding generative models to produce meaningful treatment variants and facilitating adaptive assignment in online experiments. We propose double kernel representation learning, which models the causal effect through the inner product of kernel-based representations of treatments and user covariates. We develop an alternating-minimization algorithm…
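The inner-product structure and the alternating updates described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the RBF kernel, the plain ridge penalty on the coefficient matrices, the use of an observed outcome y as the regression target, and every name below (rbf_kernel, ridge_block, fit_double_kernel, rank, lam) are assumptions chosen for illustration. The sketch represents each treatment as a row of K_t @ A and each user as a row of K_x @ B, predicts the response as the inner product of the two rows, and updates A and B in turn by solving a ridge regression with the other block held fixed.

```python
import numpy as np


def rbf_kernel(U, V, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of U and the rows of V."""
    sq_dist = ((U[:, None, :] - V[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)


def objective(K_t, K_x, A, B, y, lam):
    """Squared error of the inner-product model plus a ridge penalty (assumed)."""
    pred = ((K_t @ A) * (K_x @ B)).sum(axis=1)
    return ((y - pred) ** 2).sum() + lam * ((A ** 2).sum() + (B ** 2).sum())


def ridge_block(K, G, y, lam):
    """Exactly minimize over one coefficient block while the other is fixed.

    With G holding the fixed side's representations (n x r), the prediction
    is linear in the free coefficients C: pred_i = sum_{j,k} K[i,j] G[i,k] C[j,k].
    """
    n, r = G.shape
    Z = (K[:, :, None] * G[:, None, :]).reshape(n, n * r)
    coef = np.linalg.solve(Z.T @ Z + lam * np.eye(n * r), Z.T @ y)
    return coef.reshape(n, r)


def fit_double_kernel(T, X, y, rank=2, lam=0.1, n_iter=10, seed=0):
    """Alternate ridge updates for treatment (A) and covariate (B) coefficients."""
    rng = np.random.default_rng(seed)
    K_t, K_x = rbf_kernel(T, T), rbf_kernel(X, X)
    n = len(y)
    A = 0.1 * rng.standard_normal((n, rank))
    B = 0.1 * rng.standard_normal((n, rank))
    history = [objective(K_t, K_x, A, B, y, lam)]
    for _ in range(n_iter):
        A = ridge_block(K_t, K_x @ B, y, lam)  # update treatment side
        B = ridge_block(K_x, K_t @ A, y, lam)  # update covariate side
        history.append(objective(K_t, K_x, A, B, y, lam))
    return A, B, history


# Synthetic stand-ins: T plays the role of treatment embeddings, X of user covariates.
rng = np.random.default_rng(1)
T = rng.standard_normal((40, 5))
X = rng.standard_normal((40, 3))
y = np.sin(T[:, 0]) * X[:, 0] + 0.1 * rng.standard_normal(40)

A, B, history = fit_double_kernel(T, X, y)
print("objective at start:", history[0])
print("objective at end:  ", history[-1])
```

Because each update solves its block's ridge problem exactly, the penalized objective cannot increase from one step to the next, which is the usual reason alternating schemes of this kind are attractive. Whether the paper's algorithm uses the same penalty or update order is not stated in the abstract.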
Taxonomy
Topics: Advanced Causal Inference Techniques · Machine Learning in Materials Science · Model Reduction and Neural Networks
