An LLM's Apology: Outsourcing Awkwardness in the Age of AI
Twm Stone, Anna Soligo

TL;DR
This paper introduces FLAKE-Bench, an evaluation of how well large language models generate socially acceptable excuses for canceling commitments across social, professional, and romantic scenarios, with the aim of reducing social friction.
Contribution
It presents FLAKE-Bench, a new benchmark for measuring LLMs' effectiveness in socially delicate situations, and evaluates ten frontier or recently frontier LLMs on it.
Findings
LLMs can generate plausible excuses for canceling commitments.
Performance varies significantly across different models.
Open-source release of FLAKE-Bench facilitates future research.
Abstract
A key part of modern social dynamics is flaking at short notice. However, anxiety about coming up with believable and socially acceptable reasons to do so can instead lead to 'ghosting', awkwardness, or implausible excuses, risking emotional harm and resentment in the other party. The ability to delegate this task to a Large Language Model (LLM) could substantially reduce friction and enhance the flexibility of a user's social life while greatly minimising the aforementioned creative burden and moral qualms. We introduce FLAKE-Bench, an evaluation of models' capacity to effectively, kindly, and humanely extract themselves from a diverse set of social, professional and romantic scenarios. We report the efficacy of 10 frontier or recently-frontier LLMs in bailing on prior commitments, because nothing says "I value our friendship" like having AI generate your cancellation texts. We open-source…
Taxonomy
Topics: Law, AI, and Intellectual Property · Corporate Insolvency and Governance · Artificial Intelligence in Law
