Deep Sparse Latent Feature Models for Knowledge Graph Completion

Haotian Li; Rui Zhang; Lingzhi Wang; Bin Yu; Youwei Wang; Yuliang Wei; Kai Wang; Richard Yi Da Xu; Bailing Wang

arXiv:2411.15694·cs.CL·June 16, 2025

TL;DR

This paper introduces a deep variational autoencoder-based sparse latent feature model for knowledge graph completion, effectively combining global structural information with local textual features to improve link prediction and interpretability.

Contribution

It presents a novel probabilistic framework that integrates global clustering with local textual features using a deep VAE, advancing knowledge graph completion methods.

Findings

01 · Significant performance improvements on benchmark datasets

02 · Enhanced interpretability of latent structures

03 · Effective integration of global and local features

Abstract

Recent advances in knowledge graph completion (KGC) have emphasized text-based approaches to navigate the inherent complexities of large-scale knowledge graphs (KGs). While these methods have achieved notable progress, they frequently struggle to fully incorporate the global structural properties of the graph. Stochastic blockmodels (SBMs), especially the latent feature relational model (LFRM), offer robust probabilistic frameworks for identifying latent community structures and improving link prediction. This paper presents a novel probabilistic KGC framework utilizing sparse latent feature models, optimized via a deep variational autoencoder (VAE). Our proposed method dynamically integrates global clustering information with local textual features to effectively complete missing triples, while also providing enhanced interpretability of the underlying latent structures. Extensive…
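As background for the abstract's mention of the latent feature relational model, the sketch below shows the classic LFRM link function: each entity carries a binary feature vector z, and the probability of a link is sigmoid(z_i^T W z_j), where W holds the interaction weight for every pair of features. This is a minimal illustration of that idea only, not the paper's deep VAE model; the entity names, feature vectors, and weights are hypothetical toy values.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lfrm_link_probability(z_i, z_j, W, bias=0.0):
    """Link probability sigmoid(z_i^T W z_j + bias) for binary feature vectors.

    Only pairs of features that are both switched on contribute their
    interaction weight W[k][l] to the score.
    """
    score = bias
    for k, zi_k in enumerate(z_i):
        if not zi_k:
            continue
        for l, zj_l in enumerate(z_j):
            if zj_l:
                score += W[k][l]
    return sigmoid(score)


# Hypothetical toy data: three entities, three binary latent features each.
Z = {
    "paris": [1, 0, 1],
    "france": [1, 1, 0],
    "tokyo": [0, 0, 1],
}

# Hypothetical feature-interaction weights (assumed, not learned here).
W = [
    [2.0, 1.5, 0.0],
    [1.5, 0.5, -1.0],
    [0.0, -1.0, -0.5],
]

p_paris_france = lfrm_link_probability(Z["paris"], Z["france"], W)
p_tokyo_france = lfrm_link_probability(Z["tokyo"], Z["france"], W)
print(f"paris -> france: {p_paris_france:.3f}")
print(f"tokyo -> france: {p_tokyo_france:.3f}")
```

Because the features are binary and sparse, each active feature can be read as membership in a latent community, which is the kind of interpretability the abstract refers to.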

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 5

Strengths

1. The proposed method aims to balance the retention of critical knowledge with the elimination of redundancy, which is an interesting topic.
2. The authors not only effectively complete missing triples but also provide clear interpretability of the latent structures, which seems reasonable.

Weaknesses

1. The paper is not clearly organized, which makes it hard to follow. For example, Section 3.1 (Generative Model) lacks preliminary details on how to model MB and the other modules.
2. Figure 2 lacks explanation, e.g., of how the modules work together and correspond to the equations in the main paper. The paper also lacks the files needed to reproduce the results.
3. The paper lacks the analysis of time complexity as well as space complexity, which is necessary to study the effic

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The paper is clearly written and easy to follow.
2. The theoretical foundation of this paper is solid, utilizing extensive mathematical formulations to elucidate the structure of the proposed model or to demonstrate its validity.
3. The experiments conducted on benchmark datasets demonstrate the effectiveness of DSLFM-KGC in improving KGC performance and uncovering interpretable latent structures.

Weaknesses

1. The paper repeatedly emphasizes the advantages of the proposed model on large-scale graphs; therefore, it would be beneficial to include comparative experiments on time complexity or runtime performance between the proposed model and the baseline.
2. At the end of the introduction section, a more direct listing of the contributions of this paper should be provided, with particular emphasis on the novel points introduced for the first time in this work.
3. The paper uses more baselines on WN1

Reviewer 03 · Rating 5 · Confidence 5

Strengths

1. Experimental Results: The model shows significant performance improvements on the Wikidata5M dataset (e.g., a 5% increase in MRR and a 6.5% increase in Hit@1), with similar results observed on the WN18RR dataset.
2. Scalability: The VAE framework enables the model to perform inference on large-scale knowledge graphs, demonstrating good scalability.

Weaknesses

1. Model motivation: SBM is a well-known graph clustering algorithm, and many works combine GNNs with SBMs for community detection. Apart from the data structure used, the key difference between the proposed method and existing works is not obvious.
2. Model Complexity: The introduction of various latent feature sampling and inference mechanisms makes the inference process relatively complex, necessitating more efficient training strategies to speed up training and r

Taxonomy

Topics: Advanced Graph Neural Networks · Semantic Web and Ontologies · Graph Theory and Algorithms