Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data

Johannes Treutlein; Dami Choi; Jan Betley; Samuel Marks; Cem Anil; Roger Grosse; Owain Evans

arXiv:2406.14546 · cs.CL · December 24, 2024 · 3 cites

Open Access · 1 Repo · 8 Models

TL;DR

This paper investigates the ability of large language models to infer and verbalize implicit, latent information from scattered training data without in-context learning, revealing both capabilities and limitations relevant to safety and control.

Contribution

It introduces the concept of inductive out-of-context reasoning (OOCR) and demonstrates that LLMs can infer and verbalize hidden knowledge from limited, scattered data across various tasks.

Findings

01 LLMs can infer unknown city locations from distance data.

02 LLMs can verbalize bias in coin flips and define functions from limited data.

03 OOCR is unreliable for smaller models and complex structures.
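
To make the coin-flip finding concrete, here is a minimal sketch of how scattered evidence about a biased coin could be generated. The document format, the identifier `coin_xyz`, and the bias value are illustrative assumptions, not the paper's actual data format. Each document records a single flip, so the bias is never stated in any one document and can only be recovered by aggregating across all of them.

```python
import random

def make_coin_documents(p_heads, n_docs, seed=0, coin_name="coin_xyz"):
    """Return n_docs one-line documents, each recording one flip.

    p_heads is the latent bias. It never appears in any document.
    coin_name and the line format are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    docs = []
    for _ in range(n_docs):
        outcome = "H" if rng.random() < p_heads else "T"
        docs.append(f"flip({coin_name}) -> {outcome}")
    return docs

docs = make_coin_documents(p_heads=0.7, n_docs=1000)

# The bias is only visible in aggregate, across documents.
empirical = sum(d.endswith("H") for d in docs) / len(docs)
print(f"empirical heads rate: {empirical:.2f}")
```

In this framing, verbalizing the bias would mean that a model finetuned on such documents answers a direct question about the coin's heads probability without seeing any flips in context.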

Abstract

One way to address safety risks from large language models (LLMs) is to censor dangerous knowledge from their training data. While this removes the explicit information, implicit information can remain scattered across various training documents. Could an LLM infer the censored knowledge by piecing together these implicit hints? As a step towards answering this question, we study inductive out-of-context reasoning (OOCR), a type of generalization in which LLMs infer latent information from evidence distributed across training documents and apply it to downstream tasks without in-context learning. Using a suite of five tasks, we demonstrate that frontier LLMs can perform inductive OOCR. In one experiment we finetune an LLM on a corpus consisting only of distances between an unknown city and other known cities. Remarkably, without in-context examples or Chain of Thought, the LLM can…
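
The distance experiment described in the abstract can be pictured with a small data-construction sketch. Everything specific here is an assumption for illustration: the reference cities, the placeholder name `City 12345`, the sentence template, and the use of the haversine formula for great-circle distance. The point is only that each document gives one distance to a known city, while the latent city's name and coordinates never appear.

```python
import math

# Hypothetical reference cities as (latitude, longitude) in degrees.
KNOWN_CITIES = {
    "Madrid": (40.4168, -3.7038),
    "Berlin": (52.5200, 13.4050),
    "Rome": (41.9028, 12.4964),
}

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def make_distance_documents(latent_coords, placeholder="City 12345"):
    """One document per known city; the latent city is only ever named
    by the placeholder (a made-up identifier for this sketch)."""
    docs = []
    for name, coords in KNOWN_CITIES.items():
        d = haversine_km(latent_coords, coords)
        docs.append(f"The distance between {placeholder} and {name} is {d:.0f} km.")
    return docs

# Latent city for this sketch: the coordinates of Paris.
docs = make_distance_documents((48.8566, 2.3522))
for doc in docs:
    print(doc)
```

No single document identifies the latent city; a model would have to combine the distances across documents to locate it.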

Code & Models

Repositories

choidami/inductive-oocr (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: FLIP