COCOTree: A Dataset and Benchmark for Open Tree-Structured Visual Decomposition

Junhyub Lee; Seunghun Chae; Hyosu Kim

arXiv:2605.22068·cs.CV·May 22, 2026

TL;DR

This paper introduces COCOTree, a large-scale dataset and benchmark for decomposing images into hierarchical trees of visual components, built with an automated annotation pipeline and paired with a new evaluation metric, OTQ.

Contribution

It presents a fully automated pipeline for creating a hierarchical visual decomposition dataset and establishes a standardized evaluation protocol for open tree-structured segmentation.

Findings

01. Constructed COCOTree with over 21K images and 1.8M nodes.
02. Achieved strong alignment of generated annotations with human judgment.
03. Proposed the OTQ metric for comprehensive evaluation.

Abstract

We formalize and enable the task of open tree decomposition, which segments an image into hierarchical trees of visual components with unconstrained granularity and flexibility. Specifically, we provide the foundation benchmark for this new paradigm with the following three key contributions. First, we overcome the prohibitively high cognitive and physical bottlenecks of manual annotation by developing a fully automated generation pipeline that synergizes the semantic reasoning of Large Vision-Language Models (LVLMs) with the precise geometric grounding of SAM 3. Second, leveraging this pipeline, we construct COCOTree, a massive-scale benchmark featuring over 21K images and 1.8M structural nodes. By embracing an open-vocabulary space of over 3.5K unique labels, it successfully captures the long-tail distribution of complex physical assemblies. Notably, rigorous human evaluation confirms…
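To make the task description concrete, the following is a minimal sketch of how one decomposition tree might be represented and summarized. It is not the COCOTree annotation format: the `Node` class, its field names, the use of a bounding box in place of a segmentation mask, and the bicycle example are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    """One visual component in a decomposition tree (hypothetical schema)."""
    label: str                          # free-form, open-vocabulary label
    bbox: Tuple[int, int, int, int]     # stand-in for a mask: (x0, y0, x1, y1)
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> Iterator[Tuple[int, Node]]:
    """Yield (depth, node) pairs in pre-order; the root has depth 0."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def summarize(root: Node) -> Dict[str, int]:
    """Count nodes, distinct labels, and the deepest level of one tree."""
    visited = list(walk(root))
    return {
        "nodes": len(visited),
        "unique_labels": len({n.label for _, n in visited}),
        "max_depth": max(d for d, _ in visited),
    }


# Illustrative tree: granularity is unconstrained, so siblings may be
# decomposed to different depths (one wheel lists spokes, the other does not).
bicycle = Node("bicycle", (10, 20, 200, 180), [
    Node("wheel", (10, 100, 90, 180), [
        Node("tire", (10, 100, 90, 180)),
        Node("spoke", (30, 120, 70, 160)),
    ]),
    Node("wheel", (120, 100, 200, 180), [
        Node("tire", (120, 100, 200, 180)),
    ]),
    Node("frame", (50, 40, 160, 130)),
])

stats = summarize(bicycle)
print(stats)
```

Aggregating such per-tree summaries over a dataset is one way figures like total node count and number of unique labels could be computed, assuming a representation along these lines.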

Code & Models

Repositories

melonkick3090/COCOTree (GitHub)
