Neural PathSim for Inductive Similarity Search in Heterogeneous Information Networks
Wenyi Xiao, Huan Zhao, Vincent W. Zheng, Yangqiu Song

TL;DR
NeuPath is a novel neural network framework that efficiently approximates PathSim scores in large heterogeneous information networks, enabling faster similarity search and clustering.
Contribution
The paper introduces NeuPath, a learning-based approach that recasts PathSim approximation as a learning problem. Its encoder-decoder model is designed around the algorithmic structure of PathSim, with the aim of improving both efficiency and accuracy.
Findings
NeuPath outperforms state-of-the-art baselines in PathSim approximation.
NeuPath achieves higher accuracy in similarity search tasks.
Experiments on ACM and IMDB datasets validate its effectiveness.
Abstract
PathSim is a widely used meta-path-based similarity in heterogeneous information networks. Numerous applications rely on the computation of PathSim, including similarity search and clustering. Computing PathSim scores on large graphs is computationally challenging due to its high time and storage complexity. In this paper, we propose to transform the problem of approximating the ground truth PathSim scores into a learning problem. We design an encoder-decoder based framework, NeuPath, where the algorithmic structure of PathSim is considered. Specifically, the encoder module identifies Top T optimized path instances, which can approximate the ground truth PathSim, and maps each path instance to an embedding vector. The decoder transforms each embedding vector into a scalar respectively, which identifies the similarity score. We perform extensive experiments on two real-world datasets in…
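For context, the exact PathSim score that NeuPath is trained to approximate can be sketched in a few lines. This is a minimal illustration of the standard PathSim definition for a symmetric meta-path, s(x, y) = 2·M[x][y] / (M[x][x] + M[y][y]), where M is the commuting matrix counting path instances. It is not the NeuPath model. The author-paper-author meta-path, the toy adjacency matrix, and the function names `commuting_matrix`, `pathsim`, and `top_k` are assumptions made for illustration.

```python
from typing import List, Tuple

def commuting_matrix(adj: List[List[int]]) -> List[List[int]]:
    """Return M = A * A^T for a symmetric two-hop meta-path.

    M[x][y] counts the path instances x -> (middle node) -> y.
    """
    n = len(adj)
    return [
        [sum(a * b for a, b in zip(adj[x], adj[y])) for y in range(n)]
        for x in range(n)
    ]

def pathsim(m: List[List[int]], x: int, y: int) -> float:
    """PathSim: 2 * M[x][y] / (M[x][x] + M[y][y]); 0.0 if both nodes are isolated."""
    denom = m[x][x] + m[y][y]
    return 2.0 * m[x][y] / denom if denom else 0.0

def top_k(m: List[List[int]], x: int, k: int) -> List[Tuple[int, float]]:
    """Exact similarity search: the k nodes most similar to x, excluding x itself."""
    scores = [(y, pathsim(m, x, y)) for y in range(len(m)) if y != x]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy network: rows are authors, columns are papers.
author_paper = [
    [1, 1, 0, 0],  # author 0 wrote papers 0 and 1
    [1, 1, 1, 0],  # author 1 wrote papers 0, 1 and 2
    [0, 0, 1, 1],  # author 2 wrote papers 2 and 3
]

M = commuting_matrix(author_paper)
print(pathsim(M, 0, 1))  # 2 * 2 / (2 + 3) = 0.8
print(top_k(M, 0, 2))
```

The commuting matrix has one entry per pair of source nodes, so both the matrix product and its storage grow quickly with graph size. That cost is what motivates a learned approximation.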
Taxonomy
Topics: Advanced Graph Neural Networks · Complex Network Analysis Techniques · Text and Document Classification Technologies
