LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model
Dongkai Wang, Shiyu Xuan, Shiliang Zhang

TL;DR
LocLLM introduces a novel approach to human keypoint localization by leveraging large language models and textual clues, enabling more flexible and generalizable localization, including unseen keypoints.
Contribution
This work is the first to utilize large language models for keypoint localization from images and text, enhancing flexibility and generalization beyond traditional priors.
Findings
Achieves state-of-the-art results on standard benchmarks.
Demonstrates superior cross-dataset generalization.
Effectively detects unseen keypoints during training.
Abstract
The capacity of existing human keypoint localization models is limited by keypoint priors provided by the training data. To alleviate this restriction and pursue more general model, this work studies keypoint localization from a different perspective by reasoning locations based on keypiont clues in text descriptions. We propose LocLLM, the first Large-Language Model (LLM) based keypoint localization model that takes images and text instructions as inputs and outputs the desired keypoint coordinates. LocLLM leverages the strong reasoning capability of LLM and clues of keypoint type, location, and relationship in textual descriptions for keypoint localization. To effectively tune LocLLM, we construct localization-based instruction conversations to connect keypoint description with corresponding coordinates in input image, and fine-tune the whole model in a parameter-efficient training…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsEducational and Technological Research · Text and Document Classification Technologies · Image Retrieval and Classification Techniques
