TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding
Chenchi Zhang, Jun Xiao, Lei Chen, Jian Shao, Long Chen

TL;DR
TreePrompt introduces a novel method for visual grounding that constructs interpretable prompts based on sentence syntax trees, enhancing transparency and understanding of the reasoning process in vision-language models.
Contribution
It proposes a new prompt construction paradigm that decomposes sentences into syntax trees and composes prompts bottom-up, improving interpretability over existing holistic prompt methods.
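To make the bottom-up idea concrete, here is a minimal sketch of composing a single vector by walking a syntax tree in post-order. Everything in it is an assumption made for illustration: the `Node` class, the `toy_embed` and `compose` functions, the hand-built tree, and the averaging step are hypothetical stand-ins, not the paper's modules or any library's API. The point is only that each intermediate step is tied to one tree node and can be inspected, unlike a single holistic prompt vector.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One word in a hand-built syntax tree (hypothetical structure)."""
    word: str
    children: List["Node"] = field(default_factory=list)


def toy_embed(word: str, dim: int = 4) -> List[float]:
    # Stand-in for a learned word embedding: deterministic, carries no meaning.
    return [float((len(word) + i) % 5) for i in range(dim)]


def compose(node: Node, trace: List[Tuple[str, List[str]]]) -> List[float]:
    # Post-order traversal: every child is composed before its parent.
    child_vecs = [compose(child, trace) for child in node.children]
    vec = toy_embed(node.word)
    # Stand-in for a learned composition module: average in each child vector.
    for child_vec in child_vecs:
        vec = [(a + b) / 2.0 for a, b in zip(vec, child_vec)]
    # Record which node was composed from which children, for inspection.
    trace.append((node.word, [child.word for child in node.children]))
    return vec


# Hand-parsed tree for "a woman holding an umbrella" (assumed parse, no parser used).
tree = Node("woman", [
    Node("a"),
    Node("holding", [Node("umbrella", [Node("an")])]),
])

trace: List[Tuple[str, List[str]]] = []
prompt = compose(tree, trace)

for word, kids in trace:
    print(word, "<-", kids)
```

The recorded trace lists leaves before their parents and ends at the root, which is the property that lets each intermediate result be examined on its own.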
Findings
Effective across various backbones and benchmarks
Enhances interpretability of visual grounding models
Maintains competitive performance
Abstract
Prompt tuning has achieved great success in transferring the knowledge from large pretrained vision-language models into downstream tasks, and has dominated the performance on visual grounding (VG). However, almost all existing prompt tuning paradigms suffer from poor interpretability. In this paper, we argue that their poor interpretability is attributed to the holistic prompt generation and inference process. By "holistic", we mean that they usually directly learn a set of vectors as the prompt (i.e., prompt generation), and use the learned global prompt to augment the textual input for the VG model (i.e., prompt inference). To this end, we propose a new prompt construction paradigm with explicit explainable ability, named TreePrompt. Specifically, we first deconstruct a complex sentence into a tree that is consistent with human reasoning. Then, following the syntax tree, we compose…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
