StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Alex Zhuang; Ge Zhang; Tianyu Zheng; Xinrun Du; Junjie Wang; Weiming Ren; Stephen W. Huang; Jie Fu; Xiang Yue; Wenhu Chen

arXiv:2402.16671 · cs.CL · October 8, 2024 · 1 cites

TL;DR

This paper introduces StructLM, a series of models trained on a large instruction-tuning dataset to improve large language models' ability to interpret and use structured data such as tables, graphs, and databases. The models establish new state-of-the-art results on 8 structured knowledge grounding (SKG) tasks.

Contribution

The paper presents a new instruction-tuning dataset of 1.1 million examples and uses it to train the StructLM models (7B to 34B parameters, based on Mistral and CodeLlama), which surpass task-specific models on 16 of 18 evaluated structured knowledge grounding datasets.

Findings

01. StructLM surpasses task-specific models on 16 out of 18 evaluated datasets.

02. StructLM achieves new state-of-the-art results on 8 SKG tasks.

03. Scaling model size yields marginal improvements.

Abstract

Structured data sources, such as tables, graphs, and databases, are ubiquitous knowledge sources. Despite the demonstrated capabilities of large language models (LLMs) on plain text, their proficiency in interpreting and utilizing structured data remains limited. Our investigation reveals a notable deficiency in LLMs' ability to process structured data, e.g., ChatGPT lags behind the state-of-the-art (SoTA) model by an average of 35%. To augment the Structured Knowledge Grounding (SKG) capabilities of LLMs, we have developed a comprehensive instruction-tuning dataset comprising 1.1 million examples. Using this dataset, we train a series of models, referred to as StructLM, based on the Mistral and CodeLlama model families, ranging from 7B to 34B parameters. Our StructLM series surpasses task-specific models on 16 out of 18 evaluated datasets and establishes new SoTA performance on 8 SKG…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies