A Simple Approximation Algorithm for Optimal Decision Tree

Zhengjia Zhuo; Viswanath Nagarajan

arXiv:2505.15641 · cs.DS · May 22, 2025

Open Access

TL;DR

This paper introduces a simple approximation algorithm for the optimal decision tree problem, achieving an $8 \, \ln m$ approximation ratio in a general setting with arbitrary costs, probabilities, and responses.

Contribution

It presents a straightforward algorithm with a simplified analysis that guarantees an $8 \, \ln m$ approximation ratio, making the result easier to understand and to implement.

Findings

01

The algorithm achieves an $8 \, \ln m$ approximation ratio.

02

The approach simplifies previous complex algorithms and analyses.

03

It applies to the most general setting with arbitrary costs, probabilities, and responses.
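
To give a feel for the size of the guarantee, the sketch below evaluates the factor $8 \ln m$ for a few values of $m$. The function name `odt_bound_factor` is hypothetical and introduced only for illustration. The reading assumed here is the standard one for a minimization problem: the algorithm's expected cost is at most this factor times the optimal expected cost.

```python
import math


def odt_bound_factor(m: int) -> float:
    """Return 8 * ln(m), the approximation factor quoted above.

    Hypothetical helper for illustration; m is the number of hypotheses.
    """
    if m < 2:
        raise ValueError("need at least two hypotheses to have anything to identify")
    return 8.0 * math.log(m)


if __name__ == "__main__":
    for m in (10, 1_000, 1_000_000):
        # The factor grows only logarithmically in the number of hypotheses.
        print(m, round(odt_bound_factor(m), 2))
```

For example, with $m = 1000$ hypotheses the factor is roughly 55.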

Abstract

Optimal decision tree (ODT) is a fundamental problem arising in applications such as active learning, entity identification, and medical diagnosis. An instance of ODT is given by $m$ hypotheses, out of which an unknown "true" hypothesis is drawn according to some probability distribution. An algorithm needs to identify the true hypothesis by making queries: each query incurs a cost and has a known response for each hypothesis. The goal is to minimize the expected query cost to identify the true hypothesis. We consider the most general setting with arbitrary costs, probabilities and responses. ODT is NP-hard to approximate better than $\ln m$ and there are $O(\ln m)$ approximation algorithms known for it. However, these algorithms and/or their analyses are quite complex. Moreover, the leading constant factors are large. We provide a simple algorithm and analysis for ODT, proving an…
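
The problem setup in the abstract can be made concrete with a small sketch. The code below encodes an instance (prior probabilities, query costs, and a response table) and computes the expected query cost of a generic greedy policy that repeatedly picks the query with the lowest cost per unit of expected probability mass eliminated. This greedy rule is a common illustrative heuristic and is an assumption made here; it is not claimed to be the algorithm from the paper. All names (`greedy_expected_cost`, `probs`, `costs`, `responses`) are hypothetical, and all probabilities are assumed to be strictly positive.

```python
from typing import Dict, Hashable, List, Sequence


def greedy_expected_cost(
    probs: Sequence[float],
    costs: Sequence[float],
    responses: Sequence[Sequence[Hashable]],
) -> float:
    """Expected query cost of a simple greedy policy on one instance.

    probs[h]        : prior probability of hypothesis h (assumed > 0)
    costs[q]        : cost of asking query q
    responses[q][h] : known response of query q when h is the true hypothesis
    """

    def mass(hs: List[int]) -> float:
        return sum(probs[h] for h in hs)

    def split(alive: List[int], q: int) -> List[List[int]]:
        # Group the still-possible hypotheses by their response to query q.
        groups: Dict[Hashable, List[int]] = {}
        for h in alive:
            groups.setdefault(responses[q][h], []).append(h)
        return list(groups.values())

    def solve(alive: List[int]) -> float:
        # Returns the prior-weighted cost paid from this node onward.
        if len(alive) <= 1:
            return 0.0  # the true hypothesis is identified
        total = mass(alive)
        best = None  # (score, query index, groups)
        for q in range(len(costs)):
            groups = split(alive, q)
            if len(groups) < 2:
                continue  # query q gives no information here
            # Expected mass ruled out by asking q, conditioned on `alive`.
            eliminated = sum(mass(g) * (total - mass(g)) for g in groups) / total
            score = costs[q] / eliminated
            if best is None or score < best[0]:
                best = (score, q, groups)
        if best is None:
            raise ValueError("remaining hypotheses cannot be distinguished")
        _, q, groups = best
        # Every hypothesis in `alive` pays costs[q]; then recurse per response.
        return costs[q] * total + sum(solve(g) for g in groups)

    return solve(list(range(len(probs))))


if __name__ == "__main__":
    probs = [0.4, 0.3, 0.2, 0.1]
    costs = [1.0, 1.0, 3.0]
    responses = [
        [0, 0, 1, 1],  # cheap binary query
        [0, 1, 0, 1],  # another cheap binary query
        [0, 1, 2, 3],  # expensive query that identifies the hypothesis at once
    ]
    print(greedy_expected_cost(probs, costs, responses))
```

On this toy instance the greedy policy asks the two cheap binary queries in sequence and pays an expected cost of about 2.0, instead of paying 3.0 for the single fully informative query.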

Taxonomy

Topics: Machine Learning and Data Classification · Data Mining Algorithms and Applications