GT2Vec: Large Language Models as Multi-Modal Encoders for Text and Graph-Structured Data

Jiacheng Lin; Kun Qian; Haoyu Han; Nurendra Choudhary; Tianxin Wei; Zhongruo Wang; Sahika Genc; Edward W Huang; Sheng Wang; Karthik Subbian; Danai Koutra; Jimeng Sun

arXiv:2410.11235 · cs.CL · February 12, 2025

TL;DR

GT2Vec leverages large language models with contrastive learning to effectively encode and integrate text and graph data, improving performance across multiple tasks and datasets.

Contribution

Introduces GT2Vec, a novel framework using LLMs and contrastive learning for joint text and graph embedding, surpassing prior methods in effectiveness.

Findings

01. Outperforms existing baselines on six datasets

02. Achieves significant improvements in retrieval, classification, and question answering

03. Ablation studies confirm the effectiveness of the proposed components

Abstract

Graph-structured information offers rich contextual information that can enhance language models by providing structured relationships and hierarchies, leading to more expressive embeddings for various applications such as retrieval, question answering, and classification. However, existing methods for integrating graph and text embeddings, often based on Multi-layer Perceptrons (MLPs) or shallow transformers, are limited in their ability to fully exploit the heterogeneous nature of these modalities. To overcome this, we propose GT2Vec, a simple yet effective framework that leverages Large Language Models (LLMs) to jointly encode text and graph data. Specifically, GT2Vec employs an MLP adapter to project graph embeddings into the same space as text embeddings, allowing the LLM to process both modalities jointly. Unlike prior work, we also introduce contrastive learning to align the…
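The adapter idea described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-layer ReLU MLP, the dimensions, the choice to prepend projected graph embeddings to the text token embeddings, and all function names (`mlp_adapter`, `build_joint_input`, `info_nce`) are hypothetical. The abstract is truncated before it says what the contrastive objective aligns, so the generic in-batch InfoNCE loss shown here is only a common form of contrastive learning, included for orientation.

```python
import numpy as np


def mlp_adapter(graph_emb, w1, b1, w2, b2):
    """Two-layer MLP mapping graph embeddings (n, d_graph) to (n, d_text)."""
    hidden = np.maximum(graph_emb @ w1 + b1, 0.0)  # ReLU activation
    return hidden @ w2 + b2


def build_joint_input(graph_emb, text_token_emb, params):
    """Project graph embeddings, then prepend them to the text token embeddings.

    The result is a single sequence an LLM could consume as input embeddings
    (the prepend ordering is an assumption made for this sketch).
    """
    graph_tokens = mlp_adapter(graph_emb, *params)
    return np.concatenate([graph_tokens, text_token_emb], axis=0)


def info_nce(queries, keys, temperature=0.05):
    """Generic in-batch InfoNCE: row i of `keys` is the positive for row i of `queries`."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = (q @ k.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
d_graph, d_hidden, d_text = 8, 16, 32
params = (
    rng.normal(size=(d_graph, d_hidden)) * 0.1,
    np.zeros(d_hidden),
    rng.normal(size=(d_hidden, d_text)) * 0.1,
    np.zeros(d_text),
)

graph_emb = rng.normal(size=(3, d_graph))       # e.g. 3 graph-derived vectors
text_token_emb = rng.normal(size=(5, d_text))   # e.g. 5 text token embeddings
joint = build_joint_input(graph_emb, text_token_emb, params)
print(joint.shape)  # (8, 32)
```

After projection, both modalities share one embedding width, which is what lets a single sequence model attend over them together. In a real system the adapter weights would be trained rather than randomly initialized.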

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies

Methods: Adapter · Contrastive Learning · ALIGN