Enhancing LLM's Cognition via Structurization

Kai Liu; Zhihang Fu; Chao Chen; Wei Zhang; Rongxin Jiang; Fan Zhou; Yaowu Chen; Yue Wu; Jieping Ye

arXiv:2407.16434 · cs.CL · November 1, 2024

TL;DR

This paper introduces context structurization, which transforms unorganized text into hierarchical structures so that LLMs can understand it better. The authors report significant performance improvements across various models and tasks.

Contribution

The paper proposes context structurization, a method that organizes input context hierarchically to enhance LLM cognition, and validates it through extensive experiments.

Findings

01 Performance gains across multiple NLP tasks

02 LLaMA2-70B matches GPT-3.5-Turbo in hallucination evaluation

03 Effective distillation into smaller models like StruXGPT-7B

Abstract

When reading long-form text, human cognition is complex and structurized. While large language models (LLMs) process input contexts through a causal and sequential perspective, this approach can potentially limit their ability to handle intricate and complex inputs effectively. To enhance LLM's cognition capability, this paper presents a novel concept of context structurization. Specifically, we transform the plain, unordered contextual sentences into well-ordered and hierarchically structurized elements. By doing so, LLMs can better grasp intricate and extended contexts through precise attention and information-seeking along the organized structures. Extensive evaluations are conducted across various model architectures and sizes (including a series of auto-regressive LLMs as well as BERT-like masking models) on a diverse set of NLP tasks (e.g., context-based question-answering,…
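To make the idea of "hierarchically structurized elements" concrete, here is a minimal Python sketch. It is not the paper's pipeline: the three-level layout (one scope, several aspects, points under each aspect), the class names `Aspect` and `StructuredContext`, and the `render` function are all hypothetical, and the hierarchy is filled in by hand purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Aspect:
    """One mid-level topic and the sentences grouped under it (hypothetical schema)."""
    name: str
    points: List[str] = field(default_factory=list)


@dataclass
class StructuredContext:
    """Hypothetical hierarchy: one overall scope, several aspects, points under each."""
    scope: str
    aspects: List[Aspect] = field(default_factory=list)


def render(ctx: StructuredContext) -> str:
    """Serialize the hierarchy into numbered, indented text that could be placed in a prompt."""
    lines = [f"Scope: {ctx.scope}"]
    for i, aspect in enumerate(ctx.aspects, start=1):
        lines.append(f"{i}. {aspect.name}")
        for j, point in enumerate(aspect.points, start=1):
            lines.append(f"   {i}.{j} {point}")
    return "\n".join(lines)


# Hand-built example; in practice the grouping would have to come from some model or heuristic.
example = StructuredContext(
    scope="Context structurization for LLM inputs",
    aspects=[
        Aspect("Motivation", ["LLMs read context sequentially.",
                              "Long inputs can be hard to navigate."]),
        Aspect("Approach", ["Group sentences under shared aspects.",
                            "Order the aspects under one scope."]),
    ],
)
structured_prompt = render(example)
print(structured_prompt)
```

The numbered, indented output is only one possible serialization; any format that keeps the scope, aspect, and point levels visually distinct would serve the same illustrative purpose.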

Code & Models

Repositories

alibaba/struxgpt · jax · Official

Datasets

martineden/structurized_squad
dataset · 12 downloads

Videos

Enhancing LLM’s Cognition via Structurization · SlidesLive

Taxonomy

Methods: Sparse Evolutionary Training · Cosine Annealing · Linear Warmup With Cosine Annealing · Residual Connection · Dropout · Adam · Byte Pair Encoding · Layer Normalization · Linear Layer