Are You Sure? Challenging LLMs Leads to Performance Drops in The FlipFlop Experiment

Philippe Laban, Lidiya Murakhovs'ka, Caiming Xiong, and Chien-Sheng Wu

arXiv:2311.08596 · cs.CL · February 22, 2024 · 1 citation

Open Access

TL;DR

This paper introduces the FlipFlop experiment to analyze how LLMs respond to challenges, revealing that such prompts often cause answer flips and overall performance drops, highlighting sycophantic tendencies.

Contribution

The study systematically evaluates LLMs' multi-turn behavior, demonstrating answer flipping and accuracy deterioration, and proposes finetuning to mitigate these effects.

Findings

01 Models flip their answers 46% of the time on average

02 Accuracy drops by 17% on average between the first and final prediction

03 Finetuning reduces performance deterioration by 60%

Abstract

The interactive nature of Large Language Models (LLMs) theoretically allows models to refine and improve their answers, yet systematic analysis of the multi-turn behavior of LLMs remains limited. In this paper, we propose the FlipFlop experiment: in the first round of the conversation, an LLM completes a classification task. In a second round, the LLM is challenged with a follow-up phrase like "Are you sure?", offering an opportunity for the model to reflect on its initial answer, and decide whether to confirm or flip its answer. A systematic study of ten LLMs on seven classification tasks reveals that models flip their answers on average 46% of the time and that all models see a deterioration of accuracy between their first and final prediction, with an average drop of 17% (the FlipFlop effect). We conduct finetuning experiments on an open-source LLM and find that finetuning on…
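The two headline quantities in the abstract, the flip rate and the change in accuracy between the first and final prediction, can be illustrated with a minimal sketch. This is not the authors' code: the `Trial` record and the `flipflop_summary` function are hypothetical names, the toy labels are invented for illustration, and the sketch assumes the accuracy change is measured as final accuracy minus initial accuracy.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One two-turn conversation (hypothetical record layout)."""
    gold: str     # reference label for the classification task
    initial: str  # label the model gave in the first turn
    final: str    # label the model gave after the "Are you sure?" challenge


def flipflop_summary(trials: list[Trial]) -> dict[str, float]:
    """Compute flip rate and accuracy before/after the challenge."""
    n = len(trials)
    if n == 0:
        raise ValueError("need at least one trial")
    # A flip is any change of answer, whether or not it was correct.
    flips = sum(t.initial != t.final for t in trials)
    acc_initial = sum(t.initial == t.gold for t in trials) / n
    acc_final = sum(t.final == t.gold for t in trials) / n
    return {
        "flip_rate": flips / n,
        "acc_initial": acc_initial,
        "acc_final": acc_final,
        # Assumption: negative values mean accuracy deteriorated.
        "acc_change": acc_final - acc_initial,
    }


# Toy data, invented for illustration only.
trials = [
    Trial(gold="A", initial="A", final="B"),  # correct -> wrong
    Trial(gold="B", initial="B", final="B"),  # confirmed, correct
    Trial(gold="A", initial="B", final="A"),  # wrong -> correct
    Trial(gold="B", initial="B", final="A"),  # correct -> wrong
]
summary = flipflop_summary(trials)
print(summary)
```

Counting every change of answer as a flip keeps the flip rate separate from accuracy: a model can flip often and still gain accuracy if most flips correct earlier mistakes, so the two numbers are reported side by side.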

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research

Methods: FLIP