Structure-Preserving Graph Contrastive Learning for Mathematical Information Retrieval
Chun-Hsi Ku, Hung-Hsuan Chen

TL;DR
This paper proposes Variable Substitution, a domain-specific graph augmentation for contrastive learning that preserves mathematical formula semantics, significantly enhancing retrieval accuracy in mathematical information retrieval tasks.
Contribution
It introduces Variable Substitution as a novel augmentation technique that maintains algebraic structure, improving graph contrastive learning for mathematical formula retrieval.
Findings
Variable Substitution outperforms generic augmentation methods.
Applied to a classic GCL-based retrieval model, it significantly improves retrieval performance.
Code released on GitHub for reproducibility.
Abstract
This paper introduces Variable Substitution as a domain-specific graph augmentation technique for graph contrastive learning (GCL) in the context of searching for mathematical formulas. Standard GCL augmentation techniques often distort the semantic meaning of mathematical formulas, particularly for small and highly structured graphs. Variable Substitution, on the other hand, preserves the core algebraic relationships and formula structure. To demonstrate the effectiveness of our technique, we apply it to a classic GCL-based retrieval model. Experiments show that this straightforward approach significantly improves retrieval performance compared to generic augmentation strategies. We release the code on GitHub: https://github.com/lazywulf/formula_ret_aug
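To make the idea in the abstract concrete, here is a minimal Python sketch of what a variable-substitution augmentation on a formula graph could look like. It is an illustration under stated assumptions, not the authors' implementation: the Node class, the "op"/"var"/"const" tagging scheme, the replacement pool, and the function name variable_substitution are all hypothetical. The sketch renames every variable consistently and leaves operators, constants, and edges untouched, so the augmented view keeps the same structure as the original.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # symbol text, e.g. "+", "x", "2"
    kind: str   # "op", "var", or "const" (hypothetical tagging scheme)


def variable_substitution(nodes, edges, pool, rng):
    """Return an augmented view in which variables are consistently renamed.

    Assumption: only variable labels change; operators, constants,
    and the edge list are copied as-is.
    """
    variables = sorted({n.label for n in nodes if n.kind == "var"})
    # Sample distinct replacements so two different variables never merge.
    # rng.sample raises ValueError if the pool is smaller than the variable set.
    replacements = rng.sample(pool, k=len(variables))
    mapping = dict(zip(variables, replacements))
    new_nodes = [
        Node(mapping[n.label], n.kind) if n.kind == "var" else n
        for n in nodes
    ]
    return new_nodes, list(edges), mapping


# Operator tree for x^2 + x*y; edges are (parent index, child index).
nodes = [
    Node("+", "op"), Node("^", "op"), Node("x", "var"), Node("2", "const"),
    Node("*", "op"), Node("x", "var"), Node("y", "var"),
]
edges = [(0, 1), (1, 2), (1, 3), (0, 4), (4, 5), (4, 6)]
pool = ["a", "b", "c", "u", "v", "w"]  # hypothetical replacement names

aug_nodes, aug_edges, mapping = variable_substitution(
    nodes, edges, pool, random.Random(0)
)
```

In a contrastive setup, the original graph and the augmented graph would form a positive pair. Because the renaming is injective and applied to every occurrence, both occurrences of x still share one label and stay distinct from y, which generic augmentations such as node dropping or edge perturbation do not guarantee.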
Taxonomy
Topics: Advanced Graph Neural Networks · Mathematics, Computing, and Information Processing · Graph Theory and Algorithms
