The Weight Function in the Subtree Kernel is Decisive

Romain Azaïs; Florian Ingels

arXiv:1904.05421·stat.ML·April 14, 2022·5 cites

Open Access

TL;DR

This paper shows that the choice of weight function in the subtree kernel strongly affects performance, and proposes a data-driven way to learn that function, which improves classification results, especially on small datasets.

Contribution

The paper introduces a unified framework for computing subtree kernels with learned weight functions, improving both performance and interpretability in tree data analysis.

Findings

01. Performance of the subtree kernel improves when the weight of leaves vanishes.

02. A weight function learned from the data outperforms weights fixed by the user.

03. The approach is effective on small datasets and remains highly interpretable.

Abstract

Tree data are ubiquitous because they model a large variety of situations, e.g., the architecture of plants, the secondary structure of RNA, or the hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data is inherently difficult. In this paper, we focus on the subtree kernel, a convolution kernel for tree data introduced by Vishwanathan and Smola in the early 2000s. More precisely, we investigate the influence of the weight function from a theoretical perspective and in real data applications. We establish on a two-class stochastic model that the performance of the subtree kernel is improved when the weight of leaves vanishes, which motivates the definition of a new weight function, learned from the data and not fixed by the user as is usually done. To this end, we define a unified framework for computing the subtree kernel from ordered or unordered trees, that is…
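
To make the role of the weight function concrete, here is a minimal Python sketch of a subtree kernel on unordered trees. It is an illustration under stated assumptions, not the authors' implementation: trees are encoded as nested tuples of children, a "subtree" is taken to be a node together with all its descendants, the kernel is assumed to sum `weight(subtree) * count_in_T1 * count_in_T2` over shared subtrees, and the helper names (`subtree_counts`, `height`, `no_leaf_weight`) are hypothetical. The last weight function sets the weight of leaves to zero, the regime the abstract says improves performance.

```python
from collections import Counter

def _canonical(tree, counts):
    """Return a canonical string for the subtree rooted at `tree` and count it.

    A tree is a tuple of child trees; a leaf is the empty tuple ().
    Sorting the children's strings makes the encoding order-independent,
    so two unordered trees that are isomorphic get the same key.
    """
    key = "(" + "".join(sorted(_canonical(child, counts) for child in tree)) + ")"
    counts[key] += 1
    return key

def subtree_counts(tree):
    """Count how many times each distinct subtree occurs in `tree`."""
    counts = Counter()
    _canonical(tree, counts)
    return counts

def height(key):
    """Height of a subtree from its canonical string (a leaf has height 0)."""
    depth = deepest = 0
    for ch in key:
        depth += 1 if ch == "(" else -1
        deepest = max(deepest, depth)
    return deepest - 1

def subtree_kernel(t1, t2, weight):
    """Weighted sum, over subtrees shared by t1 and t2, of the product of counts.

    `weight` maps a canonical subtree key to a non-negative number.
    The product of counts is an assumed choice for combining occurrences.
    """
    c1, c2 = subtree_counts(t1), subtree_counts(t2)
    return sum(weight(k) * c1[k] * c2[k] for k in c1.keys() & c2.keys())

def no_leaf_weight(key):
    """Illustrative weight: leaves get weight 0, every other subtree weight 1."""
    return 0.0 if height(key) == 0 else 1.0

# Two small trees: t1 has three leaves and one cherry (a node with two leaves);
# t2 is a single cherry.
t1 = ((), ((), ()))
t2 = ((), ())

uniform = subtree_kernel(t1, t2, lambda key: 1.0)   # leaves dominate: 3*2 + 1*1 = 7
leafless = subtree_kernel(t1, t2, no_leaf_weight)   # only the shared cherry counts: 1
```

With a uniform weight, six of the seven units of similarity come from matching leaves, which carry little structural information; zeroing the leaf weight leaves only the shared internal subtree. A learned weight function, as proposed in the paper, would replace `no_leaf_weight` with values estimated from the data.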


Taxonomy

Topics: Neural Networks and Applications · Data Mining Algorithms and Applications · Advanced Clustering Algorithms Research

Methods: Convolution