Tree Edit Distance Learning via Adaptive Symbol Embeddings: Supplementary Materials and Results

Benjamin Paaßen

arXiv:1805.07123·cs.LG·May 21, 2018

TL;DR

This paper introduces a novel metric learning method for trees that learns node embeddings to improve classification, outperforming existing approaches across diverse datasets including biomedical and natural language data.

Contribution

The paper proposes learning tree edit distances indirectly through node embeddings, which ensures metric properties and improves interpretability.

Findings

01. Outperforms the state of the art on six benchmark datasets

02. Effective across diverse domains, including computer science and biomedical data

03. Scales to large datasets with over 300,000 nodes

Abstract

Metric learning aims to improve classification accuracy by learning a distance measure that brings data points from the same class closer together and pushes data points from different classes further apart. Recent research has demonstrated that metric learning approaches can also be applied to trees, such as molecular structures, abstract syntax trees of computer programs, or syntax trees of natural language, by learning the cost function of an edit distance, i.e., the costs of replacing, deleting, or inserting nodes in a tree. However, learning such costs directly may yield an edit distance that violates metric axioms, is challenging to interpret, and may not generalize well. In this contribution, we propose a novel metric learning approach for trees which learns an edit distance indirectly by embedding the tree nodes as vectors, such that the Euclidean distance between those…
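To make the idea of an embedding-induced cost function more concrete, the following is a minimal Python sketch rather than the paper's implementation. It assumes that the cost of replacing one node symbol with another is the Euclidean distance between their embedding vectors. The zero-vector convention for deletions and insertions, the symbol names, and the embedding values are all illustrative assumptions that do not come from the abstract.

```python
import math

# Hypothetical 2-D embeddings for three node symbols.
# The symbols and values are made up for illustration only.
EMBEDDING = {
    "a": (0.0, 1.0),
    "b": (3.0, 5.0),
    "c": (3.0, 1.0),
}


def replacement_cost(x, y):
    """Cost of replacing symbol x with symbol y.

    Assumption: the cost is the Euclidean distance between the embeddings.
    """
    return math.dist(EMBEDDING[x], EMBEDDING[y])


def deletion_cost(x):
    """Cost of deleting (or, symmetrically, inserting) symbol x.

    Assumption: deletion is treated as replacement with a gap symbol
    that is embedded at the origin.
    """
    return math.hypot(*EMBEDDING[x])


def satisfies_triangle_inequality(x, y, z):
    """Check that replacing x with z directly is never more expensive
    than going through y."""
    direct = replacement_cost(x, z)
    detour = replacement_cost(x, y) + replacement_cost(y, z)
    return direct <= detour + 1e-12


if __name__ == "__main__":
    print(replacement_cost("a", "b"))
    print(replacement_cost("b", "a"))
    print(deletion_cost("a"))
    print(satisfies_triangle_inequality("a", "c", "b"))
```

Because every cost in this sketch is a Euclidean distance, the costs are non-negative, symmetric, and obey the triangle inequality by construction, whichever embedding values are chosen. A cost table whose entries are learned freely has no such guarantee.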

Taxonomy

Topics: Data Quality and Management · Artificial Intelligence in Healthcare · Data Mining Algorithms and Applications