LLaMA based Punctuation Restoration With Forward Pass Only Decoding

Yutong Pang; Debjyoti Paul; Kevin Jiang; Xuedong Zhang; Xin Lei

arXiv:2408.11845 · cs.CL · August 23, 2024

TL;DR

This paper shows that LLaMA can be applied effectively to punctuation restoration and introduces Forward Pass Only Decoding (FPOD), a decoding method that speeds up inference by 19.8x while reducing hallucinations.

Contribution

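The paper makes two contributions: it applies LLaMA to punctuation restoration, where it outperforms the established benchmark, and it proposes FPOD, a decoding approach for annotation tasks that addresses LLaMA's slow inference and its tendency to hallucinate.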

Findings

01. LLaMA outperforms the established benchmark on punctuation restoration.

02. FPOD achieves a 19.8x speedup in inference.

03. FPOD reduces hallucinations during decoding.

Abstract

This paper introduces two advancements in the field of Large Language Model Annotation with a focus on punctuation restoration tasks. Our first contribution is the application of LLaMA for punctuation restoration, which demonstrates superior performance compared to the established benchmark. Despite its impressive quality, LLaMA faces challenges regarding inference speed and hallucinations. To address this, our second contribution presents Forward Pass Only Decoding (FPOD), a novel decoding approach for annotation tasks. This innovative method results in a substantial 19.8x improvement in inference speed, effectively addressing a critical bottleneck and enhancing the practical utility of LLaMA for large-scale data annotation tasks without hallucinations. The combination of these contributions not only solidifies LLaMA as a powerful tool for punctuation restoration but also…
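The abstract does not spell out how FPOD works, so the following is only a minimal sketch of one plausible reading of "forward pass only" decoding: the unpunctuated input is scored once, the predicted next token at each position is inspected, and a punctuation mark is attached only where that prediction is punctuation. The names `toy_next_tokens`, `restore_punctuation`, and `PUNCTUATION` are hypothetical, and the toy predictor is a hand-written rule standing in for a language model; none of this is taken from the paper's implementation.

```python
from typing import Callable, List

# Assumed punctuation inventory for this sketch; the paper's label set may differ.
PUNCTUATION = {",", ".", "?", "!"}

# Hypothetical model interface: for each position i, return the token predicted
# to follow tokens[: i + 1]. A real causal LM yields all of these predictions
# from a single forward pass over the input.
NextTokenFn = Callable[[List[str]], List[str]]


def toy_next_tokens(tokens: List[str]) -> List[str]:
    """Rule-based stand-in for a language model (illustration only)."""
    preds = []
    for i, tok in enumerate(tokens):
        if i == len(tokens) - 1:
            # End of input: guess a question mark after a question word opener.
            preds.append("?" if tokens[0] in {"how", "what", "why"} else ".")
        elif tok in {"hello", "well"}:
            preds.append(",")
        else:
            # Otherwise "predict" the word that actually comes next.
            preds.append(tokens[i + 1])
    return preds


def restore_punctuation(
    tokens: List[str], predict: NextTokenFn = toy_next_tokens
) -> List[str]:
    """Attach punctuation using one scoring pass and no token-by-token generation."""
    preds = predict(tokens)  # single pass over the whole input
    out = []
    for tok, pred in zip(tokens, preds):
        # Input words are copied through unchanged; only punctuation is added,
        # so the output cannot contain words absent from the input.
        out.append(tok + pred if pred in PUNCTUATION else tok)
    return out


if __name__ == "__main__":
    print(" ".join(restore_punctuation("hello how are you".split())))
    # prints: hello, how are you.
```

Because every output word is copied from the input and the model only decides whether to attach a mark, this style of decoding cannot insert or rewrite words, which is consistent with the abstract's claim of annotation "without hallucinations". Replacing an autoregressive generation loop with a single pass is also the kind of change that could account for a large speedup, though the 19.8x figure is the paper's measurement and is not reproduced by this sketch.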

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Logic, programming, and type systems

Methods: LLaMA · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus