PathVG: A New Benchmark and Dataset for Pathology Visual Grounding

Chunlin Zhong; Shuang Hao; Junhua Wu; Xiaona Chang; Jiwei Jiang; Xiu Nie; He Tang; Xiang Bai

arXiv:2502.20869·cs.CV·March 3, 2025

TL;DR

PathVG is a new benchmark for pathology visual grounding: detecting image regions described by complex expressions. It comes with a new dataset and a knowledge-enhanced model that uses LLMs to improve performance.

Contribution

We present PathVG, a novel benchmark and dataset for pathology visual grounding, and propose PKNet, a knowledge-enhanced network utilizing LLMs to handle implicit pathological information.

Findings

1. PKNet achieves state-of-the-art performance on PathVG.
2. Implicit information in pathological expressions is a major challenge.
3. Knowledge enhancement improves visual grounding accuracy.

Abstract

With the rapid development of computational pathology, many AI-assisted diagnostic tasks have emerged. Cellular nuclei segmentation can segment various types of cells for downstream analysis, but it relies on predefined categories and lacks flexibility. Moreover, pathology visual question answering can perform image-level understanding but lacks region-level detection capability. To address this, we propose a new benchmark called Pathology Visual Grounding (PathVG), which aims to detect regions based on expressions with different attributes. To evaluate PathVG, we create a new dataset named RefPath which contains 27,610 images with 33,500 language-grounded boxes. Compared to visual grounding in other domains, PathVG presents pathological images at multi-scale and contains expressions with pathological knowledge. In the experimental study, we found that the biggest challenge was the…
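To make the task concrete: a visual grounding model takes an image and an expression and returns one box, which is then compared against the language-grounded box in the dataset. The sketch below shows the scoring convention commonly used for visual grounding in other domains (a prediction counts as correct when its IoU with the ground-truth box reaches a threshold, often 0.5). Whether PathVG uses exactly this metric and threshold is an assumption here, not something the abstract states; the names `box_iou` and `grounding_accuracy` and the `(x1, y1, x2, y2)` box format are hypothetical choices for illustration.

```python
from typing import Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_accuracy(
    preds: Sequence[Box], targets: Sequence[Box], threshold: float = 0.5
) -> float:
    """Fraction of predictions whose IoU with the target is >= threshold.

    The 0.5 default is a common convention, assumed here for illustration.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    hits = sum(box_iou(p, t) >= threshold for p, t in zip(preds, targets))
    return hits / len(preds)
```

For example, `grounding_accuracy([(0, 0, 2, 2)], [(1, 1, 3, 3)])` scores a miss, because the two boxes overlap with an IoU of only 1/7.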

Code & Models

Datasets

fengluo/RefPath
dataset · 119 dl
