Subtree Mode and Applications
Jialong Zhou, Ben Bals, Matei Tinca, Ai Guan, Panagiotis Charalampopoulos, Grigorios Loukides, Solon P. Pissis

TL;DR
This paper introduces an optimal algorithm for the Subtree Mode problem in leaf-colored trees, enabling efficient computation of the most frequent leaf colors for all nodes, with applications in text analytics and biology.
Contribution
The paper presents a time-optimal $O(N)$ algorithm for the Subtree Mode problem in leaf-colored trees, including adaptations for node-colored trees and top-$k$ frequent colors, with proven efficiency and practical validation.
Findings
The algorithm is at least an order of magnitude faster than the baselines.
The solution is highly space-efficient and scalable to billion-node trees.
The approach is effective in applications such as pattern mining, sequence search, and biological data analysis.
Abstract
The mode of a collection of values (i.e., the most frequent value in the collection) is a key summary statistic. Finding the mode in a given range of an array of values is thus of great importance, and constructing a data structure to solve this problem is in fact the well-known Range Mode problem. In this work, we introduce the Subtree Mode (SM) problem, the analogous problem in a leaf-colored tree, where the task is to compute the most frequent color in the leaves of the subtree of a given node. SM is motivated by several applications in domains such as text analytics and biology, where the data are hierarchical and can thus be represented as a (leaf-colored) tree. Our central contribution is a time-optimal algorithm for SM that computes the answer for every node of an input $N$-node tree in $O(N)$ time. We further show how our solution can be adapted for node-colored trees, or for…
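To make the problem statement concrete, here is a minimal Python sketch that computes the subtree mode for every node of a small leaf-colored tree. It is not the paper's $O(N)$ algorithm: it uses the standard small-to-large merging technique as a simple baseline, which runs in roughly $O(N \log N)$ time. The function name `subtree_modes`, the child-list tree representation, and the `leaf_color` dictionary are all assumptions made for illustration.

```python
from collections import Counter


def subtree_modes(children, leaf_color, root=0):
    """Return best[v] = (color, count): a most frequent leaf color in v's subtree.

    Baseline sketch using small-to-large merging, NOT the paper's O(N) method.
    children[v] lists the children of node v; leaf_color maps each leaf to a color.
    Ties between equally frequent colors are broken arbitrarily.
    """
    n = len(children)

    # Iterative DFS: every node is appended after its parent, so the
    # reversed order visits all children before their parent.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    counts = [None] * n  # per-node color counters (reused by the parent)
    best = [None] * n    # per-node (mode color, mode frequency)

    for v in reversed(order):
        if not children[v]:  # leaf: its own color occurs once
            counts[v] = Counter({leaf_color[v]: 1})
            best[v] = (leaf_color[v], 1)
            continue

        # Reuse the counter of the child with the most distinct colors
        # and merge the smaller counters into it.
        big = max(children[v], key=lambda c: len(counts[c]))
        cnt = counts[big]
        mode_color, mode_count = best[big]
        for c in children[v]:
            if c == big:
                continue
            for color, k in counts[c].items():
                cnt[color] += k
                if cnt[color] > mode_count:
                    mode_color, mode_count = color, cnt[color]
            counts[c] = None  # release the merged child's counter
        counts[big] = None
        counts[v] = cnt
        best[v] = (mode_color, mode_count)

    return best


# Hypothetical example tree: node 0 is the root, nodes 3..7 are leaves.
children = [[1, 2], [3, 4, 5], [6, 7], [], [], [], [], []]
leaf_color = {3: "a", 4: "b", 5: "a", 6: "b", 7: "b"}
modes = subtree_modes(children, leaf_color)
print(modes[0], modes[1], modes[2])  # ('b', 3) ('a', 2) ('b', 2)
```

In this example the subtree of node 1 contains the leaf colors a, b, a, so its mode is a with frequency 2, while the whole tree contains two a's and three b's, so the root's mode is b with frequency 3.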
Taxonomy
Topics: Time Series Analysis and Forecasting · Graph Theory and Algorithms · Data Mining Algorithms and Applications
