Application of distances between terms for flat and hierarchical data
Jorge-Alonso Bedoya-Puerta, Jose Hernandez-Orallo

TL;DR
This paper explores the use of distances between terms for classifying flat and hierarchical data, demonstrating that term distances can enhance k-NN classification performance, especially when transforming flat data into XML hierarchical structures.
Contribution
It introduces a method to convert flat data into hierarchical XML format and applies term distances in k-NN classification, showing improvements over traditional methods.
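The conversion step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the source does not describe how attributes are grouped, so the `groups` mapping, the function name `row_to_xml`, and the sample attribute names are all hypothetical. It only shows how a flat attribute-value record might be nested into an XML tree using Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def row_to_xml(row, groups, root_tag="instance"):
    """Build an XML element from a flat record.

    row:    dict mapping attribute name -> value (the flat record)
    groups: dict mapping group name -> list of attribute names
            (a hypothetical grouping; the source does not specify one)
    """
    root = ET.Element(root_tag)
    grouped = set()
    for group_name, attr_names in groups.items():
        group_el = ET.SubElement(root, group_name)
        for name in attr_names:
            ET.SubElement(group_el, name).text = str(row[name])
            grouped.add(name)
    # Attributes not assigned to any group stay as direct children of the root.
    for name, value in row.items():
        if name not in grouped:
            ET.SubElement(root, name).text = str(value)
    return root


# Hypothetical flat record and grouping, for illustration only.
row = {"sepal_length": 5.1, "sepal_width": 3.5,
       "petal_length": 1.4, "petal_width": 0.2}
groups = {"sepal": ["sepal_length", "sepal_width"],
          "petal": ["petal_length", "petal_width"]}

xml_text = ET.tostring(row_to_xml(row, groups), encoding="unicode")
print(xml_text)
```

Grouping related attributes under a shared parent is what gives a term distance some structure to exploit; with an empty `groups` mapping the result is a one-level tree that mirrors the flat record.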
Findings
Term distances can significantly improve classification results.
Flat data transformed into XML hierarchies benefits from term distance measures.
Experiments show advantages over Euclidean distance in certain cases.
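To make the idea of k-NN over terms concrete, here is a minimal sketch. The source does not say which term distances were evaluated, so this example assumes a distance in the style of Nienhuys-Cheng's: 0 for identical terms, 1 when the functors or arities differ, and otherwise the sum of the argument distances divided by twice the arity. The tuple encoding of terms, the function names, and the toy training set are all hypothetical.

```python
from collections import Counter


def term_distance(s, t):
    """Nienhuys-Cheng-style distance between two terms (an assumed choice).

    A compound term is a tuple (functor, arg1, ..., argn);
    any non-tuple value is treated as a constant.
    """
    if s == t:
        return 0.0
    if not (isinstance(s, tuple) and isinstance(t, tuple)):
        return 1.0  # two different constants, or a constant vs a compound term
    if s[0] != t[0] or len(s) != len(t):
        return 1.0  # different functor or different arity
    n = len(s) - 1  # n >= 1 here, since equal terms were handled above
    return sum(term_distance(a, b) for a, b in zip(s[1:], t[1:])) / (2 * n)


def knn_classify(query, labeled_terms, k=3):
    """Majority vote among the k labeled terms closest to the query."""
    neighbours = sorted(labeled_terms,
                        key=lambda pair: term_distance(query, pair[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (term, class label) pairs.
train = [
    (("flower", ("sepal", "long", "wide"), ("petal", "short", "narrow")), "setosa"),
    (("flower", ("sepal", "long", "narrow"), ("petal", "short", "narrow")), "setosa"),
    (("flower", ("sepal", "long", "narrow"), ("petal", "long", "wide")), "virginica"),
    (("flower", ("sepal", "short", "narrow"), ("petal", "long", "wide")), "virginica"),
]
query = ("flower", ("sepal", "long", "wide"), ("petal", "short", "wide"))

print(knn_classify(query, train, k=3))
```

Unlike Euclidean distance over a flat vector, this distance halves the weight of a mismatch at each level of nesting, so differences deep in the hierarchy count for less than differences near the root.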
Abstract
In machine learning, distance-based algorithms and other approaches use information that is represented as propositional data. However, this kind of representation can be quite restrictive and, in many cases, more complex structures are required to represent data in a more natural way. Terms are the basis for functional and logic programming representation. Distances between terms are a useful tool not only to compare terms, but also to determine the search space in many of these applications. This dissertation applies distances between terms, exploiting the features of each distance and the possibility of comparing data ranging from propositional data types to hierarchical representations. The distances between terms are applied through the k-NN (k-nearest neighbor) classification algorithm using XML as a common language representation. To be able to represent these data in an XML structure…
Taxonomy
Topics: Data Management and Algorithms · Rough Sets and Fuzzy Logic · Advanced Database Systems and Queries
