PoseLLM: Enhancing Language-Guided Human Pose Estimation with MLP Alignment
Dewen Zhang, Tahir Hussain, Wangpeng An, Hayaru Shouno

TL;DR
PoseLLM introduces a nonlinear MLP-based vision-language connector in a language-guided human pose estimation framework, improving COCO accuracy by 0.4 AP over the linear-projector baseline LocLLM while preserving zero-shot generalization.
Contribution
This work pioneers the use of a nonlinear MLP connector in a large language model-based pose estimation framework, enhancing spatial-textual interaction and localization precision.
Findings
Achieves 77.8 AP on the COCO validation set, surpassing LocLLM by 0.4 AP.
Maintains strong zero-shot generalization on Human-Art and MPII datasets.
Demonstrates that a simple nonlinear connector boosts localization accuracy without losing generalization.
Abstract
Human pose estimation traditionally relies on architectures that encode keypoint priors, limiting their generalization to novel poses or unseen keypoints. Recent language-guided approaches like LocLLM reformulate keypoint localization as a vision-language task, enabling zero-shot generalization through textual descriptions. However, LocLLM's linear projector fails to capture complex spatial-textual interactions critical for high-precision localization. To address this, we propose PoseLLM, the first Large Language Model (LLM)-based pose estimation framework that replaces the linear projector with a nonlinear MLP vision-language connector. This lightweight two-layer MLP with GELU activation enables hierarchical cross-modal feature transformation, enhancing the fusion of visual patches and textual keypoint descriptions. Trained exclusively on COCO data, PoseLLM achieves 77.8 AP on the COCO…
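The connector described in the abstract (two layers with a GELU between them, replacing a single linear projection) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function names, the toy dimensions, the weight initialization, and the choice of a hidden width equal to the LLM embedding width are all assumptions made for illustration, and a real implementation would use a tensor library rather than Python lists.

```python
import math
import random


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def make_linear(in_dim, out_dim, rng):
    """Build (weights, bias) for a dense layer; weights is out_dim x in_dim."""
    scale = 1.0 / math.sqrt(in_dim)  # assumed init, for illustration only
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, layer):
    """Apply one affine map y = W x + b to a single feature vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def linear_projector(patches, layer):
    """Baseline-style connector: a single affine map per visual patch."""
    return [linear(p, layer) for p in patches]


def mlp_connector(patches, layer1, layer2):
    """Two-layer MLP connector: Linear -> GELU -> Linear, applied per patch."""
    tokens = []
    for p in patches:
        hidden = [gelu(h) for h in linear(p, layer1)]
        tokens.append(linear(hidden, layer2))
    return tokens


# Hypothetical toy sizes: 4 patches, vision width 8, LLM embedding width 16.
rng = random.Random(0)
VISION_DIM, LLM_DIM, NUM_PATCHES = 8, 16, 4
patches = [[rng.gauss(0.0, 1.0) for _ in range(VISION_DIM)]
           for _ in range(NUM_PATCHES)]

layer1 = make_linear(VISION_DIM, LLM_DIM, rng)  # vision width -> hidden width
layer2 = make_linear(LLM_DIM, LLM_DIM, rng)     # hidden width -> LLM width
tokens = mlp_connector(patches, layer1, layer2)
print(len(tokens), len(tokens[0]))  # prints: 4 16
```

Both connectors map each visual patch feature to one vector of LLM embedding width, so swapping one for the other leaves the token count unchanged; the only difference in this sketch is the GELU and the second affine map.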
