Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization

Swarnadeep Saha; Peter Hase; Mohit Bansal

arXiv:2306.09299·cs.CL·November 15, 2023·1 cites

TL;DR

This paper investigates whether large language models can teach weaker agents through natural language explanations. It reports that targeted, personalized interventions improve student performance and generalization, and that a misaligned teacher can harm student learning.

Contribution

It introduces a student-teacher framework in which an LLM teacher decides when to explain a data point and how to personalize the explanation, and shows that these strategies improve learning and generalization for weaker student agents.

Findings

01 Teacher interventions improve student reasoning performance.

02 Personalized explanations outperform generic ones.

03 Misaligned teachers can harm student learning.

Abstract

A hallmark property of explainable AI models is the ability to teach other agents, communicating knowledge of how to perform a task. While Large Language Models (LLMs) perform complex reasoning by generating explanations for their predictions, it is unclear whether they also make good teachers for weaker agents. To address this, we consider a student-teacher framework between two LLM agents and study if, when, and how the teacher should intervene with natural language explanations to improve the student's performance. Since communication is expensive, we define a budget such that the teacher only communicates explanations for a fraction of the data, after which the student should perform well on its own. We decompose the teaching problem along four axes: (1) whether the teacher's test-time intervention improves student predictions, (2) when it is worth explaining a data point, (3) how the teacher should…
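The budget idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `select_for_intervention` and `run_with_budget`, the per-example `utilities` scores, and the rule of explaining the highest-scoring examples first are all assumptions made for illustration. The only part taken from the abstract is that the teacher explains a fixed fraction of the data and the student answers the rest on its own.

```python
import math
from typing import Callable, List, Optional, Sequence, Set


def select_for_intervention(utilities: Sequence[float], budget: float) -> Set[int]:
    """Pick which data points the teacher explains.

    `budget` is the fraction of data points that may receive an explanation.
    Ranking by a utility score is an assumption of this sketch, not a rule
    stated in the abstract.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be in [0, 1]")
    k = math.floor(budget * len(utilities))
    # Sort indices from highest to lowest utility and keep the top k.
    ranked = sorted(range(len(utilities)), key=lambda i: utilities[i], reverse=True)
    return set(ranked[:k])


def run_with_budget(
    questions: Sequence[str],
    utilities: Sequence[float],
    budget: float,
    student: Callable[[str, Optional[str]], str],
    teacher_explain: Callable[[str], str],
) -> List[str]:
    """Collect student predictions, with teacher explanations only within budget."""
    chosen = select_for_intervention(utilities, budget)
    predictions = []
    for i, question in enumerate(questions):
        # The student sees an explanation only for the selected data points.
        explanation = teacher_explain(question) if i in chosen else None
        predictions.append(student(question, explanation))
    return predictions
```

Here `student` and `teacher_explain` stand in for calls to the two LLM agents, so any callable with the same shape can be plugged in, for example a stub that returns a fixed answer during testing.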

Code & Models

Repositories

swarnahub/explanationintervention
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Online Learning and Analytics