Doppelgänger's Watch: A Split Objective Approach to Large Language Models

Shervin Ghasemlou; Ashish Katiyar; Aparajita Saraf; Seungwhan Moon; Mangesh Pujari; Pinar Donmez; Babak Damavandi; Anuj Kumar

arXiv:2409.06107·cs.CL·September 11, 2024

TL;DR

This paper introduces Doppelgänger, a bicameral architecture for large language models that separates supervision signals from core capabilities, aiming to improve generation supervision.

Contribution

The paper proposes a novel split-objective architecture with a parallel module to enhance supervision in large language models, supported by theoretical analysis.

Findings

01

Theoretical insights into the split-objective approach.

02

Doppelgänger predicts supervision scores concurrently.

03

Framework aims to improve generation supervision quality.

Abstract

In this paper, we investigate the problem of "generation supervision" in large language models, and present a novel bicameral architecture to separate supervision signals from their core capability, helpfulness. Doppelgänger, a new module parallel to the underlying language model, supervises the generation of each token, and learns to concurrently predict the supervision score(s) of the sequences up to and including each token. In this work, we present the theoretical findings, and leave the report on experimental results to a forthcoming publication.
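
To make the interface described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It only illustrates the idea of a second module running in parallel to the language model and emitting one supervision score per prefix ("up to and including each token"). Everything specific here is an assumption made for illustration: the class names `ToyRecurrentModule` and `BicameralToy`, the toy recurrent encoder, the random untrained weights, the sigmoid score in (0, 1), and the choice to give the supervisor its own parameters with no access to the base model's internal states. The abstract does not specify any of these details.

```python
import math
import random

VOCAB_SIZE = 8  # hypothetical toy vocabulary size
HIDDEN = 4      # hypothetical hidden-state size


def make_matrix(rows, cols, rng):
    """Random (untrained) weights; a real system would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyRecurrentModule:
    """Tiny causal encoder: returns one hidden state per prefix."""

    def __init__(self, seed):
        rng = random.Random(seed)
        self.embed = make_matrix(VOCAB_SIZE, HIDDEN, rng)
        self.recur = make_matrix(HIDDEN, HIDDEN, rng)

    def states(self, tokens):
        hidden = [0.0] * HIDDEN
        out = []
        for tok in tokens:
            pre = [a + b for a, b in zip(matvec(self.recur, hidden), self.embed[tok])]
            hidden = [math.tanh(x) for x in pre]
            out.append(hidden)
        return out


class BicameralToy:
    """Two parallel modules with separate parameters (an assumption):
    `base` feeds next-token logits, `supervisor` feeds per-prefix scores."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.base = ToyRecurrentModule(seed + 1)
        self.supervisor = ToyRecurrentModule(seed + 2)
        self.lm_head = make_matrix(VOCAB_SIZE, HIDDEN, rng)
        self.score_head = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]

    def forward(self, tokens):
        # Language-model path: one logit vector per position.
        logits = [matvec(self.lm_head, h) for h in self.base.states(tokens)]
        # Supervision path: one score per prefix tokens[: i + 1].
        scores = []
        for h in self.supervisor.states(tokens):
            z = sum(w * x for w, x in zip(self.score_head, h))
            scores.append(1.0 / (1.0 + math.exp(-z)))  # squash into (0, 1)
        return logits, scores


if __name__ == "__main__":
    model = BicameralToy(seed=0)
    logits, scores = model.forward([1, 3, 5])
    for i, s in enumerate(scores):
        print(f"prefix length {i + 1}: supervision score {s:.3f}")
```

Because both paths in this sketch are causal, the score at position i depends only on the first i tokens, so extending a sequence never changes the scores already produced for its prefixes. How the two objectives are trained and combined is the subject of the paper's theoretical analysis and is not reproduced here.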

Taxonomy

Topics: Topic Modeling