PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars
Sumanth Prabhu

TL;DR
PEDAL is a hybrid method that improves large language model text generation by combining diverse exemplars in prompts with LLM-based aggregation, achieving better accuracy than greedy decoding and lower inference cost than self-consistency methods.
Contribution
The paper introduces PEDAL, a novel hybrid self-ensembling approach that leverages diverse exemplars and LLM aggregation to enhance text generation performance.
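The approach described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes PEDAL amounts to (1) building several prompts that differ only in their exemplars, (2) greedily decoding one candidate per prompt, and (3) asking the LLM to aggregate the candidates. The `generate` callable, the prompt templates, the random exemplar sampling, and all function and parameter names are hypothetical placeholders, since this summary does not specify them.

```python
import random
from typing import Callable, List, Tuple

Exemplar = Tuple[str, str]  # (question, worked answer); hypothetical format


def build_prompt(exemplars: List[Exemplar], question: str) -> str:
    # Few-shot prompt: exemplars first, then the target question.
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in exemplars)
    return f"{shots}\n\nQ: {question}\nA:"


def build_aggregation_prompt(question: str, candidates: List[str]) -> str:
    # Assumed template: show every candidate and ask for one final answer.
    listed = "\n".join(f"Candidate {i + 1}: {c}" for i, c in enumerate(candidates))
    return (
        f"Question: {question}\n{listed}\n"
        "Considering the candidates above, give the single best final answer."
    )


def pedal_sketch(
    question: str,
    pool: List[Exemplar],
    generate: Callable[[str], str],  # stand-in for a greedy-decoding LLM call
    num_prompts: int = 3,
    shots_per_prompt: int = 2,
    seed: int = 0,
) -> str:
    rng = random.Random(seed)
    candidates = []
    for _ in range(num_prompts):
        # Diversity comes from varying the exemplars, not from sampling tokens.
        exemplars = rng.sample(pool, shots_per_prompt)
        candidates.append(generate(build_prompt(exemplars, question)))
    # One extra LLM call aggregates the free-form candidates.
    return generate(build_aggregation_prompt(question, candidates))
```

Under these assumptions a single query costs `num_prompts + 1` LLM calls, and no separate answer-extraction step is needed because the aggregation is delegated to the model.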
Findings
PEDAL outperforms greedy decoding in accuracy.
PEDAL has lower inference cost than self-consistency methods.
PEDAL is evaluated on the SVAMP and ARC datasets.
Abstract
Self-ensembling techniques with diverse reasoning paths, such as Self-Consistency, have demonstrated remarkable performance gains in text generation with Large Language Models (LLMs). However, such techniques depend on the availability of an accurate answer extraction process to aggregate across multiple outputs. Moreover, they incur a higher inference cost than Greedy Decoding because they generate a relatively higher number of output tokens. Research has shown that the free-form text outputs from Self-Consistency can be aggregated reliably using LLMs to produce the final output. Additionally, recent advancements in LLM inference have demonstrated that the use of diverse exemplars in prompts can induce diversity in LLM outputs. Such proven techniques can be easily extended to self-ensembling based approaches to achieve enhanced results in text generation. In…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
