Unused information in token probability distribution of generative LLM: improving LLM reading comprehension through calculation of expected values

Krystian Zawistowski

arXiv:2406.10267·cs.CL·September 27, 2024

TL;DR

This paper shows that replacing greedy decoding with expected values over the next-token distribution, computed from temperature-scaled logits, improves LLM summary scoring on SummEval, with results that beat the GPT-4 0314 result on two metrics.

Contribution

It introduces scoring by expected values over the next-token distribution and a probability-based tree sampling algorithm as ways to improve LLM decoding and reading comprehension.

Findings

01

Improved correlation with human judgment on SummEval dataset.

02

Scaling logits by a large temperature increases the entropy of the scores and improves SummEval performance.

03

Probability-based tree sampling explores multiple likely generations.
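
The tree sampling idea in the last finding can be illustrated with a small sketch. This is not the paper's algorithm, whose details are not given on this page; it is one plausible reading, a best-first search that keeps every continuation whose cumulative probability stays above a threshold. `TOY_MODEL`, `next_token_probs`, and `probable_generations` are hypothetical names, and the probabilities are made up.

```python
import heapq

# Hypothetical toy "model": maps a prefix (tuple of tokens) to
# next-token probabilities. A real setup would query an LLM here.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.7, "y": 0.3},
    ("B",): {"x": 0.5, "y": 0.5},
}


def next_token_probs(prefix):
    # Prefixes missing from the table are treated as finished generations.
    return TOY_MODEL.get(prefix, {})


def probable_generations(min_prob):
    """Return (sequence, probability) for every finished generation whose
    total probability is at least min_prob, most probable first."""
    heap = [(-1.0, ())]  # max-heap via negated probability
    results = []
    while heap:
        neg_p, prefix = heapq.heappop(heap)
        p = -neg_p
        children = next_token_probs(prefix)
        if not children:
            results.append((prefix, p))
            continue
        for token, q in children.items():
            # Prune branches that can no longer reach the threshold.
            if p * q >= min_prob:
                heapq.heappush(heap, (-(p * q), prefix + (token,)))
    return results


for sequence, prob in probable_generations(min_prob=0.19):
    print(sequence, round(prob, 3))
```

Because a branch's probability can only shrink as tokens are added, pruning at the threshold never discards a generation that would have qualified, and finished generations come out in non-increasing order of probability.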

Abstract

LLM text decoding is a key component of perceived LLM quality. We present two experiments showing that decoding methods can be improved by manipulating token probabilities. First, we test several LLMs on the SummEval summary scoring dataset to measure reading comprehension. We compare scores from greedy decoding with expected values over the next-token distribution. We scale logits by a large temperature to increase the entropy of the scores. This yields a strong improvement in performance on SummEval (in terms of correlation with human judgement). We see improvement from 6-8% to 13-28% for 7B Mistral and from 20-46% to 37-56% for Mixtral, beating the GPT-4 0314 result on two metrics. Part of the gain seems related to positional bias. Second, we use a probability-based tree sampling algorithm to examine all of the most probable generations for a given prompt.
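
The contrast between a greedy score and an expected-value score can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the model rates a summary with a single token from 1 to 5 and that the logits of those five tokens are available. The logit values and the helper names `softmax`, `greedy_score`, and `expected_score` are invented for the example.

```python
import math


def softmax(logits, temperature=1.0):
    # Divide logits by the temperature before normalising; a larger
    # temperature flattens the distribution (higher entropy).
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_score(score_logits):
    # Greedy decoding keeps only the single most likely score token.
    return max(score_logits, key=score_logits.get)


def expected_score(score_logits, temperature=1.0):
    # Expected value of the score under the (temperature-scaled)
    # distribution over score tokens.
    scores = sorted(score_logits)
    probs = softmax([score_logits[s] for s in scores], temperature)
    return sum(s * p for s, p in zip(scores, probs))


# Made-up logits for the tokens "1".."5"; a real run would read them
# from the model's next-token distribution.
logits = {1: 0.5, 2: 1.0, 3: 4.0, 4: 3.5, 5: 0.2}

print(greedy_score(logits))          # 3
print(expected_score(logits, 1.0))   # a value between 3 and 4
print(expected_score(logits, 10.0))  # pulled towards the middle of the scale
```

In this toy case greedy decoding returns 3 and discards the fact that 4 was almost as likely, while the expected value keeps that information as a fractional score. Such continuous scores take more distinct values across summaries than a 1-to-5 integer does, which matters when the evaluation is a correlation with human ratings.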

Code & Models

Repositories

kzawisto/unused_information_llm
Official

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Multi-Head Attention · Weight Decay · Linear Warmup With Cosine Annealing · Adam · Residual Connection · Byte Pair Encoding