Look Ahead Text Understanding and LLM Stitching

Junlin Julian Jiang (Piedmont High School, Piedmont, CA, USA); Xin Li (College of Business, City University of Hong Kong, Hong Kong, China)

arXiv:2412.17836 · cs.CL · December 25, 2024

TL;DR

This paper introduces the look ahead section identification (LASI) problem, demonstrating how combining bidirectional and unidirectional transformer models improves understanding of developing texts, especially under noisy conditions.

Contribution

The paper proposes two approaches for stitching BERT and GPT models together for look ahead text understanding. The stitched models outperform established models, especially on noisy text.

Findings

1. Our approach outperforms established models in noisy text conditions.

2. Combining bidirectional and unidirectional models benefits look ahead understanding.

3. The methods have potential applications in social media sentiment analysis.

Abstract

This paper proposes a look ahead text understanding problem with look ahead section identification (LASI) as an example. This problem may appear in generative AI as well as human interactions, where we want to understand the direction of a developing text or conversation. We tackle the problem using transformer-based LLMs. We show that LASI is more challenging than classic section identification (SI). We argue that both bidirectional contextual information (e.g., BERT) and unidirectional predictive ability (e.g., GPT) will benefit the task. We propose two approaches to stitch together BERT and GPT. Experiments show that our approach outperforms the established models, especially when there is noise in the text (which is often the case for developing text in generative AI). Our paper sheds light on other look ahead text understanding tasks that are important to social media, such as look…
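To make the difference between SI and LASI concrete, the following is a minimal sketch of how training examples for the two tasks might be built from the same labeled document. It assumes that LASI means predicting the section of the next, not yet seen sentence from the text observed so far; the function names, the `context` parameter, and the toy section labels are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Tuple

# Hypothetical toy format: (sentence, section label) pairs in reading order.
Document = List[Tuple[str, str]]


def si_examples(doc: Document) -> List[Tuple[str, str]]:
    """Classic SI: each sentence is paired with its own section label."""
    return [(sentence, label) for sentence, label in doc]


def lasi_examples(doc: Document, context: int = 1) -> List[Tuple[str, str]]:
    """Look ahead SI (assumed formulation): the input is the last `context`
    sentences seen so far, the target is the section of the NEXT sentence."""
    examples = []
    for i in range(len(doc) - 1):  # the last sentence has no successor
        start = max(0, i - context + 1)
        seen = " ".join(sentence for sentence, _ in doc[start : i + 1])
        next_label = doc[i + 1][1]
        examples.append((seen, next_label))
    return examples


doc: Document = [
    ("Prior work is limited.", "BACKGROUND"),
    ("We train a classifier.", "METHODS"),
    ("Accuracy improves.", "RESULTS"),
]

for text, label in lasi_examples(doc, context=2):
    print(f"{label:<10} <- {text}")
```

Under this assumed formulation the model never sees the sentence it is labeling, which is one way to read the abstract's claim that LASI is harder than SI: the classifier needs some predictive ability about where the text is heading in addition to contextual understanding of what has already been written.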

Code & Models

Repositories

Julian-JJ/LLM_Look_Ahead_Classification (PyTorch, official)

Taxonomy

Methods: Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Discriminative Fine-Tuning · Linear Layer · Softmax · Dense Connections · GPT