Doppelgänger's Watch: A Split Objective Approach to Large Language Models
Shervin Ghasemlou, Ashish Katiyar, Aparajita Saraf, Seungwhan Moon, Mangesh Pujari, Pinar Donmez, Babak Damavandi, Anuj Kumar

TL;DR
This paper introduces Doppelgänger, a bicameral architecture for large language models that separates supervision signals from core capabilities, aiming to improve generation supervision.
Contribution
The paper proposes a novel split-objective architecture with a parallel module to enhance supervision in large language models, supported by theoretical analysis.
Findings
Theoretical insights into the split-objective approach.
Doppelgänger predicts supervision scores for each token prefix, concurrently with generation.
Framework aims to improve generation supervision quality.
Abstract
In this paper, we investigate the problem of "generation supervision" in large language models, and present a novel bicameral architecture to separate supervision signals from their core capability, helpfulness. Doppelgänger, a new module parallel to the underlying language model, supervises the generation of each token, and learns to concurrently predict the supervision score(s) of the sequences up to and including each token. In this work, we present the theoretical findings, and leave the report on experimental results to a forthcoming publication.
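To make the data flow in the abstract concrete, the following is a minimal sketch of a parallel module that emits one supervision score per prefix while a separate model generates tokens. It is not the authors' implementation: the abstract gives no internals, so `ToyLM`, `ToySupervisor`, the bigram table, the decaying state, and the sigmoid score are all hypothetical stand-ins with random, untrained weights. The sketch only shows that the scoring path runs alongside generation and does not alter the tokens produced.

```python
import math
import random


class ToyLM:
    """Hypothetical stand-in for the underlying language model (greedy bigram)."""

    def __init__(self, vocab_size, seed=0):
        rng = random.Random(seed)
        self.vocab_size = vocab_size
        # table[prev][next] is an untrained, random logit (assumption).
        self.table = [
            [rng.uniform(-1.0, 1.0) for _ in range(vocab_size)]
            for _ in range(vocab_size)
        ]

    def next_token(self, prev_token):
        logits = self.table[prev_token]
        return max(range(self.vocab_size), key=lambda t: logits[t])


class ToySupervisor:
    """Hypothetical parallel module: keeps its own state, scores each prefix."""

    def __init__(self, vocab_size, decay=0.5, seed=1):
        rng = random.Random(seed)
        self.decay = decay
        # One untrained, random weight per token (assumption).
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(vocab_size)]

    def initial_state(self):
        return 0.0

    def step(self, state, token):
        # Update the running summary of the prefix, then map it to (0, 1).
        state = self.decay * state + self.weights[token]
        score = 1.0 / (1.0 + math.exp(-state))
        return state, score


def generate_with_supervision(lm, supervisor, start_token, steps):
    """Generate `steps` tokens and return one score per prefix."""
    tokens = [start_token]
    state, score = supervisor.step(supervisor.initial_state(), start_token)
    scores = [score]
    for _ in range(steps):
        token = lm.next_token(tokens[-1])
        # The supervisor scores the prefix up to and including this token.
        state, score = supervisor.step(state, token)
        tokens.append(token)
        scores.append(score)
    return tokens, scores


if __name__ == "__main__":
    lm = ToyLM(vocab_size=8)
    supervisor = ToySupervisor(vocab_size=8)
    tokens, scores = generate_with_supervision(lm, supervisor, start_token=0, steps=5)
    for i, s in enumerate(scores):
        print(tokens[: i + 1], round(s, 3))
```

In this sketch the supervisor never feeds back into token selection; whether and how the scores influence generation in the actual architecture is not stated in the abstract.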
Taxonomy
Topics: Topic Modeling
